RECALL OF COMPLETED AND INCOMPLETED ACTIVITIES 
UNDER VARYING DEGREES OF STRESS * 


BY ALFRED F. GLIXMAN 


University of Mississippi 


INTRODUCTION 


Of the many dynamisms postulated 
by psychoanalysts, ‘repression’ has 
been most attractive to psychologists 
who engage in research in the field of 
personality. This is probably a con- 
sequence of the fact that repression 
has been considered, by Freud and his 
followers, as a cardinal mechanism 
from the viewpoint of both the thera- 
pist and the theorist. In addition to 
this, there is an apparent unambiguity 
of manifestation which appeals to the 
research-oriented psychologist; i.e., a 
minimum requirement for the demon- 
stration of repression is some kind of 
decrement of recall. It is necessary 
to demonstrate a decrement of recall 
before it can be argued that repression 
has been produced experimentally. 
This does not mean, however, that a 
loss of recall is the only criterion for 
the presence of repression; there are 
other criteria which will be discussed 
later. Despite the existence of a 
large number of experiments designed 


* Based on a dissertation submitted in partial 
fulfillment of the requirements for the Ph.D. 
at the University of California at Berkeley. The 
writer wishes to express his gratitude to Pro- 
fessor Warner Brown for his many helpful 
suggestions. 


to study repression, there is a lack of 
agreement regarding the results ob- 
tained and the conclusions which 
stem from them. This study at- 
tempts to present a clear-cut descrip- 
tion of the effects of stress (threat to 
self-esteem) upon the recall of com- 
pleted and incompleted activities. 
Sears (19) and Rapaport (13) have 
presented critical reviews of most of 
the literature dealing with the effects 
of emotional factors on recall. A 
large part of this literature is devoted 
to the difference between recall of 
pleasant and of unpleasant material. 
There is general agreement that this 
approach is irrelevant to the problem 
of repression because the kind of 
hedonic tone involved is so different 
from the kind discussed by the psy- 
choanalysts. Another approach is 
exemplified by the studies of Flanagan 
and Sharp (cited in 19). Flanagan 
compared the learning, immediate 
recall, and delayed recall of paired 
associates the combination of which 
yielded sexual meanings with the 
same measures of neutral paired as- 
sociates. Sharp used the same tech- 
nique with paired associates having 
religious or profane connotations. In 
both experiments, the neutral lists 
were more quickly learned and better 
recalled. The experiments were crit- 
icized correctly by Sears (19, p. 114f) 
because “the conditions under which 
the experiments were performed did 
not preclude the possibility that the 
differences was [sic] caused by em- 
barrassment and conscious reluctance 
to speak forbidden words in the pres- 
ence of the experimenter.” 

Sears (18) attacked the problem by 
inducing feelings of success and failure 
for card-sorting performance. He 
found that the performance of his 
‘failure’ group was less efficient than 
that of his ‘success’ group. The 
‘failure’ group also was less efficient 
in the learning of nonsense syllables 
which were presented throughout the 
course of the experiment. 

The most vigorous approach is 
apparent in the studies of Rosenzweig 
and Mason (15), Rosenzweig (16), 
Sanford (17), and Alper (1). Rosenzweig 
and Mason found that “given 
an individual of sufficient intellectual 
maturity and a commensurate meas- 
ure of pride, experiences that are un- 
pleasant because they wound self- 
respect... are... less apt to be 
remembered than experiences that 
are gratifying to the ego.” Rosen- 
zweig (16) found that in an ‘informal’ 
experimental situation—i.e., a situ- 
ation where the subject’s ‘self-esteem’ 
was not threatened—a preponderance 
of subjects recalled more interrupted 
activities than completed ones. In 
the ‘formal’ situation, a preponder- 
ance of subjects recalled fewer inter- 
rupted activities than completed ones. 
These findings are offered by the 
authors as evidence to demonstrate 
repression. Alper (1) found no reli- 
able difference between recall of com- 
pleted and incompleted tasks. She 
promises to examine the nature of the 
individual differences in a future 
paper. 
While Rosenzweig and Mason (15) 
found that the more mature subjects 
(defined in terms of C.A. and M.A.) 
had a greater tendency to choose for 
repetition completed rather than 
failed (interrupted) activities, San- 
ford (17) presents evidence that the 
more mature subjects have a greater 
tendency to persist in the completion 
of failed activities than do the more 
immature subjects. He also finds 
that the more mature subjects recall 
more failures than successes. 

Unless a way is found to reconcile 
these results, they will remain of little 
use for an experimental approach to 
repression. As an initial step to- 
wards a reconciliation, the differences 
and similarities in approach will be 
considered. 

The four experiments just discussed 
hold in common a basic assumption 
and, as a result, a mode of attack. 
The finding of Zeigarnik (21)¹ that 
interrupted activities are recalled 
more often than completed activities 
in situations where the interruption is 
not regarded as a threat to the sub- 
ject’s self-esteem is taken as a starting 
point. The assumption is then made 
that if a subject is placed in a situ- 
ation where the interruption of a 
task threatens his self-esteem, and if 
this results in a decrease or reversal of 
the recall inequality (i.e., the differ- 
ence between recall of incompleted 
and completed tasks), then there is 
evidence for the existence of repression.² 
The mode of attack is characterized 
by the following procedure: 


¹ For reviews of the literature on recall and 
resumption of interrupted and completed activities, 
see Lewin (6) and Prentice (12). For an 
investigation of the factors that influence recall 
of incompleted and completed tasks, see 
Pachauri (11). 

² Alper uses the term “selective recall” rather 
than “repression.” Although she hypothesizes 
that there should be no recall inequality for a 
group of Ss who have been randomly selected 
with respect to personality characteristics, she 

The subject is asked to perform a 
number of tasks which can be used 
equally well in threatening and non- 
threatening situations. (Sanford, 
Rosenzweig and Mason, and Rosen- 
zweig all used jigsaw puzzles; Alper 
used scrambled sentences.) The com- 
pletion and incompletion of the tasks 
is controlled by the experimenter; i.e., 
he makes sure there are as many in- 
completions as completions. (Alper 
did not control the number of com- 
pletions.) Recall is asked for im- 
mediately after the last activity has 
stopped. The recall is introduced as 
though it were incidental to the rest 
of the procedure. The recall in- 
equality in the threatening situation 
is compared with the inequality in a 
control situation which is part of the 
experimental plan (Rosenzweig, Al- 
per), or else it is compared with the 
results provided by Zeigarnik. The 
threat is introduced by presenting the 
tasks in a manner which makes it clear 
to the subject that his ability will be 
evaluated in terms of his performance. 
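
As an illustration only, and not as part of any of the procedures reviewed, the recall inequality and its shift between a non-threatening and a threatening situation can be sketched in Python. The function names and the counts are hypothetical and were chosen purely to show the arithmetic; no data from the cited studies are reproduced.

```python
def recall_inequality(incompleted_recalled: int, completed_recalled: int) -> int:
    """Recall of incompleted tasks minus recall of completed tasks."""
    return incompleted_recalled - completed_recalled


def inequality_shift(control: int, threat: int) -> int:
    """Change in the recall inequality from the control to the threat situation.

    A negative value corresponds to the decrease (or reversal) that the
    reviewed studies take as evidence for repression.
    """
    return threat - control


# Hypothetical counts for a single subject (assumed values, not study data).
control = recall_inequality(incompleted_recalled=6, completed_recalled=4)
threat = recall_inequality(incompleted_recalled=3, completed_recalled=5)
shift = inequality_shift(control, threat)
```

With these assumed counts the inequality is positive in the control situation and negative in the threatening one, which is the pattern of reversal described above.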

With this similarity in approach, 
the disagreement in results seems sur- 
prising. These discrepancies become 
even harder to evaluate when addi- 
tional considerations are taken into 
account. Alper’s results are based 
on 10 subjects. She also had 10 con- 
trol subjects who were used only in 
the non-threatening situation to de- 
monstrate the comparability of differ- 
ent sets of sentences. In fairness to 
her, it should be pointed out that, 
since her subjects appeared in both 
situations, the efficiency of her experi- 
ment is greater than is indicated by 
the small number of subjects. On 


has accepted implicitly the assumption outlined; 
i.e., she has classified her Ss on the basis of the 
direction of the shift of the recall inequality when 
the threatening situation is compared with a non- 
threatening one. 
the other hand, since there is an investment 
of 400 hours of clinical 
examination (1, p. 405, footnote 3) 
which will be used to account for the 
results based on only 10 subjects—no 
matter how efficiently used—the im- 
plied criticism seems justified. 

Sanford’s subjects were seven to 
fifteen years old; the subjects of 
Rosenzweig and Mason were five 
years and six months to fourteen years 
and eight months old. The M.A. 
range for Sanford’s subjects was six 
to twenty-one; for Rosenzweig and 
Mason’s subjects it was four years 
and two months to thirteen years and 
two months. Sanford’s subjects were 
“normal, healthy children in a private 
school”; the subjects of Rosenzweig 
and Mason were “inmates of the New 
England Peabody Home for Crippled 
Children.” The subjects of both 
Rosenzweig and Alper were adults. 

Rosenzweig and Mason (15) and 
Sanford (17) discuss the relationship 
between resumption and recall. For 
Rosenzweig and Mason, ‘resumption’ 
refers to the subject’s repetition- 
choice with the opportunity to repeat 
a task actually present; for Sanford, 
‘resumption’ refers to the subject’s 
insistence on working with the puzzle 
even though he had been told to stop. 

The general confusion surrounding 
the results just presented would seem 
to justify another study which is de- 
signed to throw light on some of the 
points in question. The present in- 
vestigation makes this attempt. Its 
point of departure is rooted in two 
biases: 

1. None of the reviewed experi- 
ments has dealt with repression as the 
psychoanalysts talk of it. It may 
be that the kind of forgetting that 
takes place in these experiments is re- 
lated to the kind of forgetting which 
is described in clinical reports, but 
there does not seem to be any justification 
for considering the two kinds 
of forgetting as synonymous at this 
stage of knowledge. As Freud (4; 2, 
p. 148f.) presents the concept of ‘re- 
pression,’ three criteria for the judg- 
ment of its presence appear: (1) there 
must be an attempt to find gratifica- 
tion for a need; (2) there must be an 
opposing need such that its gratifica- 
tion makes the gratification of the 
first need painful; (3) as a result of 
this opposition of needs, the following 
symptoms appear: loss of recall of the 
original need, or loss of recall of 
ideational material associated with 
the conflict-situation, or both. It is 
the forgetting which takes place under 
these special conditions that is called 
‘repression.’³ 

Whether one chooses to accept these 
as criteria for experimentally induced 
‘repression’ is dependent upon the 
desire to investigate repression as it is 
defined clinically. ‘Repression’ in an 
experiment based on an interruption-situation 
does not satisfy those criteria; 
i.e., the gratification of the need 
for self-esteem does not make the 
completion of a task painful. The 
same criticism would seem to hold for 
forgetting as a consequence of the 
feelings of failure induced by Sears 
(18); i.e., the feelings of failure do not 
prevent further sorting of cards. 
Even if they did, the discontinuation 
of card-sorting would not be com- 
parable, say, to the stopping of 
masturbation by a child because the 
practice displeases the mother. In 
the first instance it is the failure that 
is ‘bad’; in the second one, it is the 


³ These criteria represent a minimal set of 
formal characteristics; one might wish, for in- 
stance, to add a specified time between the 
‘traumatic’ event and the recall test. It is also 
possible to list a set of criteria which is concerned 
with the content of the material which is re- 
pressed. It may be that different factors are 
related to repression, depending upon the kind 
of material involved. 
practice itself which is ‘bad.’ For 
this reason the writer prefers to think 
of this experiment as one concerned 
with ‘selective forgetting’ and not with 
repression. 

2. Working with groups of subjects 
is more efficient than working with 
individuals; the number of hours spent 
in experimenting is reduced sharply. 
If a situation could be structured to 
produce selective forgetting readily 
for a large number of subjects, it 
would be easier to determine the 
factors which influence selective for- 
getting. It would also make it easier 
to select large numbers of subjects for 
clinical examinations. For these rea- 
sons an attempt was made to find a 
real-life situation which would affect 
a large number of people. A tech- 
nique was devised which made pos- 
sible the use of groups of subjects 
rather than individuals. 

Since part of the controversy in the 
field is concerned with resumption, a 
secondary purpose of the experiment 
is to devise a group technique for 
resumption. This would make a 
comparison of recall and resumption 
effects possible. The primary pur- 
pose of the experiment is to determine 
the effects of increasing threat to 
self-esteem on recall of completed and 
incompleted activities. 


STATEMENT OF THE PROBLEM 


The experiment to be described is 
designed to study the recall 
of completed and incompleted activities 
in neutral and stress situations. 
The following assumptions are made: 

1. The experimenter’s attempt to 
produce a threat to the subject’s 
self-esteem is, for most subjects, suc- 
cessful. 

2. The order of magnitude of stress 
as the experimenter presents it represents 
the degree of threat to the subjects; 
i.e., subjects in the least stressful 
situation do not feel that their self-esteem 
is threatened as much as do 
the subjects in the intermediate or 
most stressful situations. 

3. Much the same generalizations 
may be made about the effects of 
increasing stress whether the same 
group of subjects is used in situations 
of different degrees of stress, or 
whether a different group of subjects 
is used for each situation. 

In an attempt to bring the problems 
raised by previous investigators into 
sharper focus, the following predic- 
tions are made: 

I. The recall of incompleted activi- 
ties will decrease as the stress is in- 
creased. 

II. The recall of completed activi- 
ties will increase as the stress is in- 
creased. 

These predictions were tested by 
placing college students in three 
situations of varying degrees of threat 
to self-esteem. The subjects used two 
equivalent forms of 20 paper-and- 
pencil tests, and followed four different 
sequences of recall and resumption. 


PROCEDURE AND SUBJECTS 
Material 


Two forms of each of 20 pencil-and-paper 
tasks were used. There were five different orders 
of presentation of the tasks. The activities as 
they appeared in Order One⁴ are: Word Construction, 
Analogies, Coin Problems, Scrambled 
Words, Name Comparisons, Proverbs, Number 
Series, Synonyms, Maze, Reversed Spelling, 
Logical Reasoning, Addition, Sentence Comple- 
tion, Code, Number Square, Letter Crossout, 
Scattered Numbers, Number Comparisons, 
Opposites, Cities. 

Four booklets (Task, Resumption, Recall, 
and Dummy) were given to each S in each of the 
groups. The order in which they were used 
differs for the different experimental treatments. 


⁴ Because every order was not represented in 
every Situation × Sequence × Form cell, Order 
was not kept as a separate classification in the 
analysis of the data. See Table I. 
Task booklet.—This booklet contained 20 
pages of mimeographed material. Each page 
consisted of one task and its instructions. There 
were covering sheets in the front and back to 
prevent S from seeing the tasks prematurely. 
For Situation I, the face sheet was blank. For 
Situation II and III, the face sheet had a mimeo- 
graphed “Intellectual Alertness Inventory” on it. 

Resumption booklet.—This booklet was iden- 
tical in form with the Task Booklet. Each test 
appeared in the same position as its homologue, 
it had the same name, and it had similar ma- 
terial, but it was only one-half to three-fourths 
as long. The length was shortened so that about 
as many activities might be resumed in a reason- 
able length of time as were recalled, and so that 
there would be no feeling of having to work a 
long time on tasks that had been done before. 
If the Task Booklet was Form A, the Resump- 
tion Booklet was Form B. 

Recall booklet.—This booklet contained 20 
blank pages, one for the recording of each re- 
called task. 

Dummy booklet.—This consisted of 20 blank 
pages. Its purpose was to prevent the Ss from 
knowing that recall or resumption, as the case 
may be, was the last thing they had to do. 


Experimental Plan 


The experimental design is based on a 
three-way classification (Situation × Sequence 
× Form) with five individuals in each 
three-way cell. 

Situations.—There are three Situations which 
are intended as a continuum of stress; i.e., as a 
continuum of increasing threat to self-esteem. 

Situation I.—This situation was intended to 
be ‘neutral.’ The attempt was made to hold to 
a minimum the probability of interpreting one’s 
performance on the tasks as a reflection of 
ability. The emphasis was placed on E’s interest 
in the tasks per se rather than on E’s interest 
in S’s performance.⁵ 

When the Ss appeared in the classroom in 
which the experiment was conducted, they were 
told to sit anywhere in the room. E stood in 
front of the desk, and spoke informally. 

This is the first part of an experiment which 
I am conducting and in which I will use several 
different tasks. I have to find out how long it 
takes to complete each one because the timing 


⁵ In the competitive atmosphere of the class- 
room it is almost impossible to make an S feel 
that his ability—or lack of it—is not responsible 
for his performance; an E can only try to de- 
emphasize the importance of the S’s performance 
as an evaluation of what the S can ‘really’ do. 
The ‘neutral’ situation, then, should be regarded 
only as less stressful than the other situations. 
will be an important part of the experiment. 
I have a rough idea of how long they should 
take, but now I am trying to find out exactly 
what the timing should be. You people are 
here to give me an idea of how long I should 
expect people to work on these tasks. In the 
second part of the experiment, the scoring will 
be all-or-none. Each task will be scored as 
either all right or all wrong; there will be no 
partial credit. In order to get credit for a 
task, the subjects will have to finish all of it 
and get it all right. They will be told that 
there is a penalty for guessing, so I am going 
to ask you, too, not to guess. If you don’t 
know an answer, don’t put anything down. 
Each of you will be given four booklets face 
down. Leave them that way until you are 
told to turn them over. [After the booklets 
were distributed]: There is one task on each 
page of the first booklet, and there is enough 
room on the page in which to put your answer. 
Since the subjects later on will work under 
standard conditions, I am going to ask you to 
observe them, too. Work as quickly as you 
can. Be as accurate as you can. Speed and 
accuracy have equal weight in the scoring. 
Don’t guess. You will do one task at a time. 
I will give a signal to turn the page for each 
task. When the signal is given, turn the page 
and read the instructions at the top of the 
page; then begin working. Do not turn back 
to a previous page. Stop working as soon as 
you are told to stop. Stop immediately. If 
you finish before I say ‘stop,’ check your work. 


Situation II.—In this situation the attempt 
was made to make S feel that his performance on 
these tasks would, in some way, be used as a 
basis for evaluating his ability. There was a 
greater orientation towards the S than in Situation I, 
and the emphasis on the tasks qua tasks 
was lessened. Whereas in Situation I the E 
tried not to imply that the command to stop 
working was related to S’s ability, in Situation II 
there was the implication that if the S did not 
finish the task he had failed to give a satisfactory 
performance. The situation was structured as a 
test situation in which E acted as an administra- 
tor and proctor. He stood behind the desk and 
spoke formally. When the Ss came into the 
experimental room, E said: 


Please seat yourselves so that there is at 
least one empty seat between you and your 
neighbor. We will follow the regular examina- 
tion procedure here. 

Now let me tell you why you are here. You 
are no doubt aware that this university, along 
with most other universities, is very over- 
crowded. This is indicated by President 
Sproul’s statement that entrance requirements 
should be raised, and by the fact that fewer 
out-of-state applications are being accepted. 
Another problem with which the university is 
faced is that of weeding out students who will 
be unsuccessful. From the administration’s 
point of view the earlier in a student’s career 
that this can be done, the better. Some of us 
in the Department of Psychology have been 
working on a test which will serve this purpose; 
i.e., it will tell us which students will not have 
successful careers. 

You may have learned that there are sev- 
eral ways of estimating an individual’s capaci- 
ties. One way that looks most promising to 
us is by means of an “Intellectual Alertness 
Inventory.” We have developed such a test 
to give us an idea of how rapidly you can grasp 
a situation and act in accordance with the in- 
structions given you. Right now we are inter- 
ested in developing norms for the test; we 
want to see how well college students in general 
can do. You people are part of a large group 
which will give us our standards. It is very 
important that you do well. Other students 
will be compared with you. 

Each of the tests will be scored all-or-none. 
Each test will be scored as either all right or 
all wrong; there will be no partial credit. In 
order to get credit for a test, you will have to 
finish all of it and get it all right. There will 
be a penalty for guessing, so don’t guess. If 
you don’t know an answer don’t put anything 
down. Work as quickly as you can. Work 
as accurately as you can. Speed and accuracy 
have equal weight in the scoring. 

On the basis of the data we already have, 
enough time is allowed for each test so that you 
should be able to finish it. 

Each of you will be given four booklets face 
down. Leave them that way until you are 
told to turn them over. [After the booklets 
were distributed]: There is one test on each 
page of the first booklet, and there is enough 
room on the page in which to put your answer. 

When the signal is given, turn the page and 
read the instructions at the top of the page; 
then begin working. Do not turn ahead to 
the next page until you are told to do so. Do 
not turn back to a previous page. Stop 
working as soon as you are told to stop. Do 
not try to finish a line or a sentence after you 
are told to stop. Stop immediately. If you 
finish before I say ‘stop,’ check your work. 

Work quickly. Do the best you can. 
Don’t guess. 


Situation III.—In this situation, the attempt 
was made to make the relationship between S’s 
performance and his ability even more clearcut 
than in the second situation. All of the emphasis 
was placed on how well S did and on the 
importance of his doing well. Since it is as- 
sumed that as the demand for good performance 
increases the potential threat to self-esteem in- 
creases, Situation III should be the most stressful 
situation. 

The E’s manner was the same as it was in 
Situation II; he made the same introductory 
remarks about following the examination pro- 
cedure. The instructions were exactly the same, 
with one exception: Instead of telling the Ss that 
they were part of a normative group, they were 
told that: “We have standardized this inventory, 
and we now want information about individuals. 
We want to see how well each of you can do. 
We will check your performance on this test with 
your grade-point average, which we will use as 
an indication of your success in school. It is 
very important that you do the best you can on 
this test.” 

At the end of the instructions, the following 
was repeated: “Work quickly. Do the best you 
can. Don’t guess. Remember, it is important 
that you do well.” 

Sequences.—‘Sequence’ refers to the order in 
which the booklets were used. 

Sequence 1: Task-resumption-recall.—The situ- 
ation was set and the booklets were distributed. 
The Ss started with the Task Booklet. Two 
min. were allowed for each of the 20 tasks. The 
Ss were forbidden to turn a page before the 
signal to turn was given. They were not al- 
lowed to return to an earlier page. Instructions 
to “check your work” were given to keep them 
occupied with the task for the full two-min. 
period. After the Task Booklets were finished, 
they were collected. The following instructions 
for resumption were given: 


It will take me about fifteen minutes to put 
these papers in order. In order to keep you 
from doing things which are irrelevant to our 
purposes here, I have [“‘we have” for Situa- 
tions II and III] provided you with a booklet 
which contains the same tasks as the ones that 
appeared in the first booklet. To keep you 
from getting bored slight changes have been 
made in the content of each activity. Do only 
the ones you would like most to do. You can 
set your own pace, but I prefer that you keep 
working the whole time. The choice of ac- 
tivities is left entirely to you. 

It would be helpful to me if you numbered 
the tasks you do in the order in which you do 
them; I then could get an idea of which tasks 
you liked best. 


Immediately following the 15-min. resumption 
period, the Resumption Booklets were collected. 
Following the collection of these booklets, 
the instructions for recall were given (five 
min. were allotted to recall): 


Before we continue working, there is some- 
thing I want you to do. The next booklet is 
made up of a number of blank pages. Please 
tell me what tests [or “tasks” in Situation I] 
you remember. Write down the name of the 
test you remember. If you can’t remember 
the name of a test, put down a brief description 
or give a short sample of it. Make your de- 
scriptions or samples brief, but put down 
enough so that I can recognize the tests clearly. 
Put down one test on a page. After you have 
written the name of a test, turn the page. Do 
not look back on a previous page. 

Write the tests in the order in which you 
remember them; do not try to reproduce the 
order in which they were given. 


After the recall period, the Recall and Dummy 
Booklets were collected. 

Sequence 2: Task-recall-resumption.—The in- 
structions, timing, and collection procedure were 
the same as in Sequence 1. Sequence 2 differed 
from Sequence 1 only by having recall precede 
resumption instead of vice versa. 

Sequence 3: Task-interpolated activity-recall.— 
Sequence 3 is identical to Sequence 1 with one 
exception: i.e., instead of resumption falling 
between the initial activities and recall, a 15-min. 
neutral activity was interpolated. Sequence 3, 
therefore, may be used as a control for the time 
which elapsed between the use of the Task 
Booklet and the Recall Booklet in Sequence 1. 
The interpolated activity was the Kuder ‘Voca- 
tional Preference Record.’ The instructions for 
its use were: 


The next thing we are going to do is this: 
I shall read some questions from a vocational 
interest record. Each question has three 
choices which I shall read twice. Write your 
preferences on the sheet of blank paper which 
is provided for you. Write a “1” if you prefer 
the first choice, a “2” if you prefer the second 
choice, and a “3” if you prefer the third choice. 
You don’t have to number your answers; just 
make sure you keep them in order. 


The material was given orally to ensure uni- 
formity of pace. 

Sequence 4: Task-interpolated activity-resump- 
tion.—Sequence 4 is identical to Sequence 2, 
with one exception: i.e., instead of recall falling 
between the initial activities and resumption, a 
five-min. neutral activity was interpolated. The 
activity and the accompanying instructions were 
the same as in Sequence 3. 

Forms.—Two forms of material comparable 
in content and difficulty were used. For ex- 
ample: In ‘Synonyms’ a list of words and their 
synonyms was selected; the odd words were put 
into Form A and the even words were put into 
Form B. The use of two forms was necessitated 
by the manner in which resumption was meas- 
ured in this study. 
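
The odd-even split described for ‘Synonyms’ can be sketched as follows. This is a minimal illustration and not the writer's actual materials: it assumes that ‘odd’ and ‘even’ refer to an item's position in the selected list, and the item labels and the function name are hypothetical placeholders.

```python
def split_into_forms(items):
    """Alternate list items between two forms.

    Items in positions 1, 3, 5, ... go to Form A; items in positions
    2, 4, 6, ... go to Form B (positions counted from 1, by assumption).
    """
    form_a = items[0::2]
    form_b = items[1::2]
    return form_a, form_b


# Placeholder labels standing in for the selected words (assumed).
items = ["item-1", "item-2", "item-3", "item-4", "item-5", "item-6"]
form_a, form_b = split_into_forms(items)
```

Alternating items in this way draws both forms from one pool, which is one plausible route to the comparability in content and difficulty that the text requires.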


Subjects 


The Ss were taken from the introductory 
course in Psychology which met during the first 
Summer Session in 1947. They were told by the 
instructor that they were expected to devote a 
two-hour period to departmental research. The 
E had a program card for each student in the 
class. He prepared lists of names for the in- 
structor. At each meeting of the class—for the 
duration of the experiment—the instructor read 
the names of the students who would serve as Ss 
for that day and the day following. This pro- 
cedure was followed to lend prestige to the calling 
for Ss. It also served to insure that students 
would come to the experiment, and it discouraged 
bias of sampling which is possible when Ss 
volunteer. 

Although 135 Ss appeared in the experimental 
sessions, the data of only 120 were used. The 
data of four Ss were discarded because they apparently 
misunderstood the instructions for recall.⁶ 
The data of two more Ss were rejected 
because the Ss were clearly too old to be typical 
of the student group. Nine records were dis- 
carded so that there would be equal numbers of 
Ss in each of the subgroups to be considered. 
None of these nine records was examined prior 
to rejection. 

In order to maintain anonymity in the first 
two situations, most of the Ss did not indicate 
name or sex on their data sheets. Since the sex 
of a reject was unknown, the exact ratio of male 
to female is indeterminate. Our best guess is 
that the number of males was between 90 
and 100. 

Of the 120 Ss (all of whom attended the sum- 
mer session), 102 had attended the regular ses- 
sion of the university, and 18 were either in the 
Extension Division or were awaiting admission 
to the regular session. 

No S appeared in more than one situation, 
went through more than one sequence, or used 
more than one form of the Task Booklet. The 
Ss were assigned to groups on the basis of their 
free hours, and each group of Ss was assigned to 
a treatment in a random manner. 

Since summer-session classes, in the main, 


⁶ It looked as though they had interpreted the 
recall instructions to be a request for free association. 
Whether this was an actual misunderstanding, 
facetiousness, or a clinical symptom is 
unknown. 
met every day at the same hours, the free hours 
for a given S were usually the same every day; 
therefore the only way a selection could be made 
which would produce bias in the results would 
be to have the same situation or the same se- 
quence appear the same hour of the day all the 
time. There were three experimental periods 
each day: 9:10-11:00, 1:10-3:00, and 3:10-5:00. 
Since there were 12 major groups (3 situations by 
4 sequences; the forms were distributed ran- 
domly within each of these groups), it took four 
days to complete the experiment. To prevent 
a possible systematic selection, no situation ap- 
peared more than twice at any one experimental 
time-of-day, and every situation appeared at 
least once in every experimental time-position. 
No sequence appeared more than once in any 
experimental time-position. 
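
The counterbalancing rules just described can be checked mechanically. The sketch below is not the schedule actually used (the paper does not list it); it builds one hypothetical assignment of the 12 Situation by Sequence groups to the 12 session slots (4 days by 3 times of day) and verifies the three stated rules. All names, and the particular rotation used to fill the slots, are assumptions made for illustration.

```python
from collections import Counter

SITUATIONS = ["I", "II", "III"]
SEQUENCES = [1, 2, 3, 4]
PERIODS = ["9:10-11:00", "1:10-3:00", "3:10-5:00"]
N_DAYS = 4


def build_schedule():
    """Return one hypothetical layout: {(day, period): (situation, sequence)}."""
    schedule = {}
    for day in range(N_DAYS):
        for p, period in enumerate(PERIODS):
            # Rotate situations across periods and days; run one sequence per day.
            situation = SITUATIONS[(day + p) % len(SITUATIONS)]
            sequence = SEQUENCES[day]
            schedule[(day + 1, period)] = (situation, sequence)
    return schedule


def satisfies_constraints(schedule):
    """Check the counterbalancing rules stated in the text."""
    groups = list(schedule.values())
    # Each of the 12 Situation x Sequence groups must be run exactly once.
    if len(groups) != 12 or len(set(groups)) != 12:
        return False
    for period in PERIODS:
        in_period = [g for (_, per), g in schedule.items() if per == period]
        situation_counts = Counter(s for s, _ in in_period)
        sequence_counts = Counter(q for _, q in in_period)
        # Every situation at least once and at most twice per time-of-day.
        if any(not 1 <= situation_counts[s] <= 2 for s in SITUATIONS):
            return False
        # No sequence more than once per time-of-day.
        if any(count > 1 for count in sequence_counts.values()):
            return False
    return True


if __name__ == "__main__":
    schedule = build_schedule()
    for slot, group in schedule.items():
        print(slot, group)
    print("constraints satisfied:", satisfies_constraints(schedule))
```

Many other layouts satisfy the same rules; the point of the sketch is only that the rules are jointly satisfiable with 12 groups in 12 slots.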

Because the number of summer-session stu- 
dents is so small (18 summer-session students as 
compared with 102 regular students), no distinc- 
tion was made between them and the regular 
students in the analysis of the data. Because
not every order was represented in every
Situation × Sequence × Form cell, Order was
not retained as a classification in the analysis of
the data. Every three-way cell (i.e., Situation
× Sequence × Form) contained five individuals,
none of whom appeared in more than one cell.


RESULTS 


This analysis of the results con- 
cerns itself only with the recall 
changes under varying degrees of 
stress.⁷ Its purpose is to examine the


7 For detailed tables of data order Document 
2532 from American Documentation Institute, 
1719 N Street, N.W., Washington 6, D. C., 
remitting $0.50 for microfilm (images one in. 
high on standard 35 mm. motion picture film) 
or $0.60 for photocopies (6 X 8 in.) readable 
without optical aid. The data for resumption of 
activities are not presented. None of the main 
effects for resumption was significant. A 
thorough analysis of the pattern of resumption 
choices has not been made, but a cursory exami- 
nation indicates that Ss followed a consistent 
choice-pattern regardless of the tasks involved. 
The non-effectiveness of the resumption pro- 
cedure, however, does not affect the recall results; 
only significant interactions involving Situation 
in the present analysis would affect the main 
results of this paper. 

Although there was no interest in the differ- 
ences between Forms A and B, Form was re- 
tained as a classification to provide a smaller 
estimate of error variance.

TABLE I

MEAN NUMBER OF INCOMPLETED AND OF INCOMPLETED-RECALLED TASKS
(N per three-way cell = 5)

              Situation I          Situation II         Situation III
              Form A    Form B     Form A    Form B     Form A    Form B
              x    | y
Sequence 1    11.8 | 7.6
Sequence 2    12.0 | 6.2
Sequence 3    13.6 | 7.2

x refers to the number of incompleted tasks.
y refers to the number of incompleted-recalled tasks.
Situation I is the least stressful situation; Situation III is the most stressful one.
Sequence 1: Task-Resumption-Recall; Sequence 2: Task-Recall-Resumption; Sequence 3: Task-Interpolated Activity-Recall.


situation differences for incompleted-recalled
activities and for completed-recalled
activities. Since the number
of completions was not the same for
all Ss, it is necessary to partial out
the effect of the number of completions
on the two kinds of recall; therefore
the data will be subjected to an
analysis of covariance. Analysis of
covariance is a method which may be
used to test the hypothesis that a set
of adjusted sample means are drawn
from the same population (with respect
to the mean), where the adjustment
partials out the effects of one
or more variables.⁸
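
The adjustment can be illustrated with a minimal sketch for the simplest case: one classification and one covariate. Each group's mean on the criterion y (e.g., tasks recalled) is corrected by the pooled within-group regression slope times the group's deviation on the covariate x (e.g., tasks incompleted). The data below are invented for illustration and are not taken from the experiment; the function name and group labels are likewise hypothetical, and the three-way design analyzed in this paper is not reproduced.

```python
def adjusted_means(groups):
    """Return (adjusted means of y per group, pooled within-group slope).

    groups maps a group name to a list of (x, y) observations, where x is
    the covariate and y is the criterion.
    """
    all_x = [x for obs in groups.values() for x, _ in obs]
    grand_mean_x = sum(all_x) / len(all_x)

    sum_xy = 0.0  # pooled within-group sum of cross-products
    sum_xx = 0.0  # pooled within-group sum of squares of x
    raw_means = {}
    for name, obs in groups.items():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        raw_means[name] = (mean_x, mean_y)
        sum_xy += sum((x - mean_x) * (y - mean_y) for x, y in obs)
        sum_xx += sum((x - mean_x) ** 2 for x, _ in obs)

    slope = sum_xy / sum_xx
    adjusted = {
        name: mean_y - slope * (mean_x - grand_mean_x)
        for name, (mean_x, mean_y) in raw_means.items()
    }
    return adjusted, slope


# Invented data: both groups have the same raw mean of y (7.0), but the
# second group had more incompleted tasks available to recall.
EXAMPLE = {
    "low stress": [(10, 6), (12, 7), (14, 8)],
    "high stress": [(12, 6), (14, 7), (16, 8)],
}

if __name__ == "__main__":
    adjusted, slope = adjusted_means(EXAMPLE)
    print("pooled within-group slope:", slope)
    for name, value in adjusted.items():
        print(name, "adjusted mean:", value)
```

In this made-up case the raw recall means are identical, yet the adjusted means differ, which is the kind of difference the analysis of covariance is meant to expose.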

Table I contains the mean number
of incompleted and of incompleted-recalled
activities for each Situation
× Sequence × Form cell; N in each
cell is five. The reduced variance
(with the effects of number of completions
partialled out) of incompleted-recalled
tasks is analyzed in
Table II. The last two entries in
the first row of this table indicate
that there is significant variation among
the adjusted situation means of incompleted-recalled
activities: F = 4.38; F at the five-percent point is
3.13. The adjusted means (see
Table III) are: Situation I, 6.32;
Situation II, 5.10; Situation III, 5.00.
The t-values for tests of significance
between pairs of means ¹⁰ are (Table
III): (Sit. I)-(Sit. II) = 2.45; P =
0.01. (Sit. II)-(Sit. III) = 0.19; P
= 0.85. (Sit. I)-(Sit. III) = 2.63;
P = 0.01.

8 For a description of analysis of covariance,
see Kendall (5), Lindquist (8), and Snedecor (20).

Forms A and B refer to equivalent forms of tasks.

The variation of adjusted means is 
not significant for either Form or 
Sequence, nor are any of the inter- 
actions significant. In view of the 
findings just presented, it is clear that 
as the degree of stress is increased the 
recall of incompleted activities de- 
creases. This relationship holds when 
Situation I is compared either with 
Situation II or with Situation III, but 
it does not hold for the comparison 
between Situation II and Situation 
III.
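
The F-ratio for Situation can be recomputed from the sums of squares printed in Table II, as a check on the arithmetic. In the sketch below the sums of squares are those of the table; the degrees of freedom are an assumption, since that column is not legible in this copy. They are inferred from the ratio of each sum of squares to its printed mean square and from the design (3 Situations, 3 Sequences, 2 Forms, 90 Ss, one covariate). The dictionary and function names are illustrative.

```python
# Sums of squares as printed in Table II.
SUMS_OF_SQUARES = {
    "Situation": 31.70,
    "Sequence": 14.34,
    "Form": 12.44,
    "Sit. x Seq.": 4.67,
    "Sit. x Form": 3.22,
    "Seq. x Form": 7.86,
    "S. x S. x F.": 17.57,
    "Error": 256.66,
}

# ASSUMPTION: degrees of freedom inferred from SS / MS and from the design;
# they are not legible in the source table.
DEGREES_OF_FREEDOM = {
    "Situation": 2,
    "Sequence": 2,
    "Form": 1,
    "Sit. x Seq.": 4,
    "Sit. x Form": 2,
    "Seq. x Form": 2,
    "S. x S. x F.": 4,
    "Error": 71,
}


def mean_square(source):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return SUMS_OF_SQUARES[source] / DEGREES_OF_FREEDOM[source]


def f_ratio(source):
    """F = mean square of the effect divided by the error mean square."""
    return mean_square(source) / mean_square("Error")


if __name__ == "__main__":
    for source in SUMS_OF_SQUARES:
        if source != "Error":
            print(f"{source:14s} MS = {mean_square(source):6.2f}  F = {f_ratio(source):5.2f}")
    print("total sum of squares:", round(sum(SUMS_OF_SQUARES.values()), 2))
```

Under these assumed degrees of freedom the Situation ratio rounds to the reported 4.38, which exceeds the tabled five-percent value of 3.13, and the component sums of squares add to the printed total of 348.46.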

Table IV contains the mean number
of completed and of completed-recalled
activities for each Situation
× Sequence × Form cell; N in each
cell is five. The reduced variance
9 The five-percent level of confidence will be
used throughout to evaluate significance of
results.

10 For a description of the standard error of
the difference between two adjusted means, see
Kendall (5, p. 244) and Lindquist (8, p. 195).

TABLE II

ANALYSIS OF REDUCED VARIANCE FOR INCOMPLETED-RECALLED TASKS
(The effects of number incompleted partialled out)

Source of        Sum of     Degrees of    Mean                F at
Variation        Squares    Freedom       Square     F        5% Point

Situation         31.70                   15.85      4.38     3.13
Sequence          14.34                    7.17
Form              12.44                   12.44
Sit. × Seq.        4.67                    1.17
Sit. × Form        3.22                    1.61
Seq. × Form        7.86                    3.93
S. × S. × F.      17.57                    4.39
Error            256.66                    3.61
Total            348.46

Situation refers to varying degrees of stress. Sequence refers to the order in which recall and
resumption appeared. Form refers to two forms of equivalent material.


(with the effects of number of com- 
pletions partialled out) of number of 
completed-recalled tasks is analyzed 
in Table V. With the exception of 
the adjusted interaction between Se- 
quence and Form, none of the effects 
is significant. For Situation, F = 
0.17; F at the five-percent point is 
3.13. The significant interaction, 
however, does not affect the non- 
significant variation among the ad- 
justed situation means of completed- 
recalled activities. It is clear, then, 
that increased stress does not affect 
significantly the recall of completed 
activities. 

Earlier in this paper, two predic- 
tions were made: (1) the recall of 
incompleted activities will decrease 
as the stress is increased; (2) the 
recall of completed activities will 


TABLE III

RECALL OF INCOMPLETED TASKS: ADJUSTED
MEANS FOR SITUATIONS I, II, AND III
AND THE t-TESTS FOR SIGNIFICANCE
OF DIFFERENCES AMONG THEM

                  Situation I    Situation II    Situation III
Adjusted Mean        6.32           5.10            5.00

Comparison                  t       P
(Sit. I)-(Sit. II)         2.45    0.01
(Sit. II)-(Sit. III)       0.19    0.85
(Sit. I)-(Sit. III)        2.63    0.01


increase as stress is increased. The 
first prediction was supported by the 
data, but the second one was not. 
Before attempting to collate the
present findings with those already in
the literature, it might be well to
consider the evidence for the existence
of a threat to self-esteem. Alper
(1, p. 411f.) used a two-fold criterion
for increasing stress: the number of
completions and the number of tasks
recalled must decrease. The decreases
in number of completions and in number
of recalled tasks were taken as indices of
disruption, and hence as evidence for
the existence of stress. Neither of
these indices seems to be appropriate,
nor does there seem to be any other
satisfactory criterion to offer in their
place. Although Alper found significantly
fewer completions in her
threatening situation as compared
with her non-threatening one, it is
likely that the number of completions
is not a decreasing monotonic function
of stress. In this study, comparisons
of the mean number of
completions yielded the following
t-values: (Sit. II)-(Sit. I) = 2.70;
P = 0.01 (72 degrees of freedom).
(Sit. II)-(Sit. III) = 3.23; P = 0.01.
(Sit. I)-(Sit. III) = 1.15; P = 0.25.

TABLE IV

MEAN NUMBER OF COMPLETED AND OF COMPLETED-RECALLED TASKS
(N per three-way cell = 5)

              Situation I          Situation II         Situation III
              Form A    Form B     Form A    Form B     Form A    Form B
Sequence 1
Sequence 2
Sequence 3

x refers to the number of completed tasks.
y refers to the number of completed-recalled tasks.
Situation I is the least stressful situation; Situation III is the most stressful one.
Sequence 1: Task-Resumption-Recall; Sequence 2: Task-Recall-Resumption; Sequence 3: Task-Interpolated Activity-Recall.
Forms A and B refer to equivalent forms of tasks.


TABLE V

ANALYSIS OF REDUCED VARIANCE FOR COMPLETED-RECALLED TASKS
(The effects of number completed partialled out)

Source of        Degrees of    F at
Variation        Freedom       5% Point

Situation
Sit. × Seq.
Sit. × Form
Seq. × Form
S. × S. × F.
Error
Total

Situation refers to varying degrees of stress. Sequence refers to the order in which recall and
resumption appeared. Form refers to two forms of equivalent material.


It is clear that there was a greater
number of completions in Situation
II than in either Situation I or
Situation III. It could be argued
that, up to a point, as stress increases
there is an increase in motivation and
a resulting increase in the number of
completions. Beyond that point, increasing
stress has a disrupting effect,
and the number of completions decreases.¹¹

11 See Murphy, Murphy, and Newcomb (9, p.
474f.) for summaries of studies which present evidence
that threat to self-esteem (in the form of
criticism or blame) may provide an incentive for
increased efficiency.


Analysis of the total-recall data
reveals a consistent decrease in the
adjusted mean recall (number of
completions partialled out) from Situation
I to Situation III: F = 5.77;
F at the five-percent point is 3.13.
This finding is in agreement with
that of Alper, who found a significant
decrease in total-recall when her stress
situation was compared with the non-stress
situation. This criterion, however,
is not independent of the variables
under investigation. If recall
of completed and of incompleted
tasks both decrease significantly, then
it is likely that the total-recall will
decrease significantly. If recall of
completed and incompleted tasks both
increase significantly, then it is likely
that the total-recall will increase significantly.
(Rosenzweig’s data (16),
for instance, show a non-significant
increase of total recall; there was no
change in the recall of incompleted
activities, and a nearly significant
increase in the recall of completed
activities.) If recall of completed
tasks increases significantly but the
recall of incompleted tasks decreases
significantly, then it is likely that
there will be no significant change in
the total-recall. In view of the fact
that there is no unambiguous external
measure of the degree of stress, the
statement that it exists, and that it
exists in a stated order, must remain
at the level of an assumption. One
can look only at what an experimenter
has done, and accept or reject the
statement that he has produced a
threat to self-esteem. The point of
view taken in this paper is that the
operations performed in the present
experiment constituted stressful situations
of varying degrees.

Comparing the data of this experiment
with the data of previous experiments
is complicated by the use of
different measures of recall changes.
Rosenzweig (16) and Alper (1) studied
the effects of stress on recall by using
a recall-ratio (Rosenzweig) or recall-difference
(Alper) score. In both
instances a score which combined the
recall of completed and of incompleted
tasks was used. Re-analysis
of their data (4) indicates that this
combined score often obscures the
trends of the component scores. In
the present study recall scores for
incompleted and completed activities
(where an adjustment partialling out
the number of completions was made)
were treated separately. If the recall
of completed activities and the recall
of incompleted activities are treated
separately, the following results
appear:

Recall of incompleted activities as stress increases:
  Rosenzweig:   Non-significant decrease     (t = 0.23; P = 0.82)
  Alper:        Near-significant decrease    (t = 2.04; P = 0.07)
  Glixman:      Significant decrease         (F = 4.38; P < 0.05)

Recall of completed activities as stress increases:
  Rosenzweig:   Near-significant increase    (t = 1.90; P = 0.06)
  Alper:        Significant decrease         (t = 2.84; P = 0.01)
  Glixman:      Non-significant decrease     (F = 0.17; P > 0.05)


There are two outstanding char- 
acteristics which emerge from this 
summary: First, the present study is 
the first one using an interruption 
technique to yield a significant de- 
crease in the recall of incompleted 
activities as a function of threat to 
self-esteem; this finding is supported
by Alper’s near-significant decrease 
for recall of incompleted activities. 
Second, Alper’s finding with respect 
to completed activities is markedly 
unusual; it is totally unexpected in 
the context in which it appeared. 
There seems to be no obvious reason 
for the recall of completed activities 
to decrease as stress increases; i.e., 
there seems to be no obvious reason 
for the tasks on which the Ss were 
successful to be forgotten to a greater 
degree than the tasks on which they 
did not succeed. The discrepancies 
among the results may be reconciled 
if three assumptions are made: 


1. Alper’s ‘completed activities’ did not pro- 
duce ‘feelings of completion.’ Ss in her experi- 
ment were told that there was more than one 
solution for each of the scrambled sentences
presented to them. In order to equate working-time
on the sentences, Ss were told to try to find
as many correct solutions as they could. It does 
not seem unreasonable, therefore, to suppose 
that ‘completing’ a task—the criterion for com- 
pletion being one successful solution—did not 
mean completion to S. It is possible that S 
became more involved in a task to which one 
solution was found than in one to which no solu- 
tion was found. Since unscrambling the sen- 
tence is essentially an all-or-none task (i.e., there 
could be no partly satisfactory solution), S might 
have felt that a sentence to which he could not 
find even one satisfactory solution was com- 
pletely beyond his ability, and he could not 
expect himself to give a satisfactory performance. 
On the other hand, “If I can find one satisfactory 
solution, why should I not be able to find the rest 
of them?” The evidence for goal-gradient behavior
and personal observation of people taking
examinations and intelligence tests incline the
present investigator to accept as reasonable the
statement that ‘tasting blood’ makes for involvement
which is at least as great as that produced by not being
able to find any kind of satisfactory solution.
This assumption may be put to test either by 
using in Alper’s experimental set-up the kind of 
tasks used in this study, or by using Alper’s 
material in an experimental set-up similar to the 
one described in this paper. In the first in- 
stance, the results should be similar to the ones 
found here; in the second instance, the results 
should be similar to the ones found by Alper. 

2. Rosenzweig’s stress situation was not as 
threatening as Alper’s or the ones in this study. 
Rosenzweig’s situation falls below an unspecified 
critical point, and the others fall beyond the 
critical point. 

3. Increased recall of completed activities is a
more superficial defense mechanism than is de- 
creased recall of incompleted activities, and as 
stress is increased beyond a critical point the in- 
crease in recall will disappear. If the situation 
is threatening enough, a decrease in recall of 
completed activities may result. 


If these three assumptions are ac- 
cepted, the results of Rosenzweig 
(16), Alper (1), and the present study 
may be reconciled. Since Rosen- 
zweig’s stress situation falls below 
the unspecified critical point, there 
should be an increased recall of com- 
pleted activities, but no significant 
change in the recall of incompleted 
activities. This is essentially what 
Rosenzweig found. The present 
stress situations fall beyond the
critical point; therefore there should
be no increase in the recall of completed
activities (possibly a decrease),
and there should be a decrease in
the recall of incompleted activities.
Alper’s stress situation also falls beyond
the critical point. If the assumption
is made that Alper’s ‘completed’
tasks were functionally equivalent to
the incompleted activities in Rosenzweig’s
and the present studies, then
her results should agree with the
present ones. If the necessary assumptions
are accepted this agreement
is present.

The reasoning underlying the rec- 
onciliation just presented may be 
stated in the form of a multiple hy- 
pothesis: As threat to self-esteem is in- 
creased, two tendencies may be ob- 
served: (1) as stress is increased to a 
critical point, there will be an increase 
in the recall of completed tasks; beyond
that point, the increase will
disappear and there may be a decrease
in the recall of completed tasks.
(2) Somewhere above the point in the
stress-scale where the increase in
recall of completed tasks starts, increased
stress will result in a decreasing recall
of incompleted activities. This hypothesis
could be put to test by devising
a finer stress continuum than
now exists; i.e., if a large number of 
test-points were constructed to range 
from the most “neutral” to the most 
stressful (in terms of the operations 
used to define stress as they are de- 
scribed in the literature), curves for 
the recall of completed and incom- 
pleted activities could be plotted as 
a function of threat to self-esteem. 
The hypothesis seems to be warranted 
not only by the existing data but also 
because it does not seem unreasonable 
to suppose that the compensatory 
defense represented by an increased 
emphasis (recall) on completed activi- 
ties will appear under slight stress; it
also seems reasonable to suppose that
it will disappear as the situation be- 
comes more painful to the individual. 
Where the consequences are not dire, 
an individual may be able to prevent 
a self-appraisal of his shortcomings by 
emphasizing his assets. At some
point on the stress continuum one
might expect a stronger, cruder defense
mechanism to become operative,
so that a decrease in the recall of incompleted
activities would ensue.¹²
As the stress increased ‘it would not be 
worth the individual’s while’ to main- 
tain both kinds of defense, and the 
relatively weaker compensatory in- 
crease in recall of completed tasks 
would disappear. If the threat to 
self-esteem were extreme, all super- 
ficial aspects of the situation—of 
which the names of the activities 
would be one—might be inaccessible 
to recall. 

It is probably more fruitful, at this 
point, to maintain the distinction 
between the forgetting produced in 
this experiment and ‘repression’ as 
defined by clinicians. Whether the 
same set of dynamics is involved in 
the two processes is an empirical 
question. In view of the distinctions 
—with respect to criteria of form and 
content—which can be made at this 
time, it would seem premature to 
equate the two kinds of forgetting. 
The present study provides an economical
technique for producing a
decrement of recall as a consequence
of stress. These stress situations
could be used to study the effects of
stress on other behavioral patterns
(e.g., rigidity, projection) and the
interrelationships among the different
kinds of behavior. It would be relatively
easy to select Ss for clinical
study on the basis of recall scores
which are equated for number of
completions and recall ability (as determined
by some external test).
Ss with high adjusted recall for incompleted
tasks could be compared
with Ss with low adjusted recall with
respect to a large number of clinical
variables. If the same set of factors
is found to be operative in selective
forgetting as is operative in repression,
it would be reasonable to talk of
‘experimentally induced repression.’
As things now stand, this phrase is
misleading and is, perhaps, injurious
to experimental approaches to personality.

12 The reasoning in this discussion is given
further support by data reported by Lewis and
Franklin (7, p. 197f). If changes in recall of
incompleted and completed tasks are analyzed
separately (data from 7, Tables I and II, p. 197),
both a significant decrease in recall of incompleted
tasks and a significant increase in recall
of completed tasks appear. These results have
not been discussed in the present paper, because
they are a minor part of the study by Lewis
and Franklin. It is of further interest to note
that these authors state that there is no loss of
recall of incompleted tasks; the arguments for
this statement are too lengthy to consider here,
but the writer does not agree with them.


SUMMARY 


The general purposes of the experiment
were to determine whether (1)
situations of varying degrees of stress
(threat to self-esteem) could be structured
for groups of individuals; (2)
changes in stress would produce
changes in recall of incompleted and 
completed tasks; (3) a group tech- 
nique could be used for the study of 
resumption of tasks; (4) changes in 
stress would produce changes in re- 
sumption of tasks. 

Specifically, the experiment was 
designed to test the following pre- 
dictions: 

I. As stress increases, the recall of
incompleted activities decreases.

II. As stress increases, the recall of 
completed activities increases. 

One hundred twenty students from 
the introductory course in Psychology 
at the University of California were 
used; 18 students were in attendance
only for the Summer-Session (when
the experiment was conducted), and
102 students also attended the regular
session. Since only the recall data
were examined in this study, the data
for only 90 of the 120 Ss were used.

The experiment was designed so 
that the data, which were collected 
by a group procedure, could be treated 
by analysis of covariance; the effect 
of the number of completions on recall 
was partialled out. There were three
classifications: (1) three Situations 
producing three degrees of stress; (2) 
three Sequences which represented 
different arrangements of recall and 
resumption; and (3) two equivalent 
Forms of 20 paper-and-pencil tasks. 
Each three-way cell contained five 
individuals, none of whom appeared 
in more than one cell. 

Prediction I was supported by the 
data; i.e., there was a significant decrement
of recall of incompleted activities as
stress increased. Prediction II was
not supported by the data; i.e., there 
was no significant increase in the recall 
of completed activities as stress increased.
Similarities and differences
between these results and the results 
in the literature were pointed out. 
A revised hypothesis was suggested 
which would be consistent with all 
the available data. Suggestions for 
future research were made. 


(Manuscript received June 7, 1948)

THE RELATIONSHIP BETWEEN THE STRENGTH OF A
HABIT AND THE DEGREE OF DRIVE PRESENT
DURING ACQUISITION ¹

BY BRADLEY REYNOLDS


University of Missouri


INTRODUCTION 


We will be concerned in this study 
with the relationship between the 
strength of a habit and the degree of 
drive present while the habit is being 
acquired. Habit, as it will be used
here, is a functionally defined term
equivalent in meaning to Hull’s
sHR (6). Drive will refer to the disposition
of an organism to engage in a
certain kind of behavior under a given
set of circumstances. An indirect
measure of drive based upon an empirical
law relating drive and an experimental
variable will be employed.
This law is that describing the relationship
between the hunger drive
and food privation for the white rat.
It will be assumed that the behavior,
or performance, P, of an organism is
a multiplicative function of the
strength of habit, H, and the degree
of drive, D; that is, P = H × D. It
follows immediately from this as- 
sumption that if training is carried 
out with two groups of subjects 
differing in drive, D, any observed 
difference in performance, P, in favor 
of the more highly motivated group 
can as well be ascribed to differences 
in D as to differences in H, or to 
differences in both H and D. If 
differences in P are to be ascribed to 
differences in H, then it becomes 
necessary to equate D for both groups. 
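
The confounding described above can be made concrete with a small numerical sketch. It takes the multiplicative assumption P = H × D at face value; the particular values of H and D are arbitrary and hypothetical, chosen only to show that performance alone cannot separate habit from drive, and that equating D at test recovers the ratio of habit strengths. The function names are illustrative.

```python
def performance(habit, drive):
    """Assumed multiplicative rule from the text: P = H x D."""
    return habit * drive


# Hypothetical groups: a weaker habit under higher drive and a stronger
# habit under lower drive yield the same observed performance.
weak_habit_high_drive = performance(habit=2.0, drive=3.0)
strong_habit_low_drive = performance(habit=3.0, drive=2.0)


def habit_ratio_at_equal_drive(habit_a, habit_b, test_drive):
    """With D equated at test, the ratio of performances equals H_a / H_b."""
    return performance(habit_a, test_drive) / performance(habit_b, test_drive)


if __name__ == "__main__":
    print("P (H=2, D=3):", weak_habit_high_drive)
    print("P (H=3, D=2):", strong_habit_low_drive)
    # The common test drive cancels out of the ratio, whatever its value.
    print("ratio at D=5:  ", habit_ratio_at_equal_drive(3.0, 2.0, 5.0))
    print("ratio at D=0.5:", habit_ratio_at_equal_drive(3.0, 2.0, 0.5))
```

The cancellation of D in the ratio is what makes a common test drive necessary before differences in P can be read as differences in H.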

1 The study reported in this paper was made
possible through a grant from the Research
Council of the Graduate School of the University
of Missouri.

There are at least four desiderata
of the design for an experiment aimed
towards demonstration of the relationship
between degree of drive and
growth of habit strength.

1. It is required that groups of 
subjects be differentiated with respect 
to degree of drive. If privation is to 
be used as an indirect measure of 
degree of drive, it must be recognized 
that the direction of the difference 
between any two levels of privation 
is not necessarily consistent with the 
direction of the differences between 
the associated drives. 

2. Following the training of groups 
showing different degrees of drive, 
tests for level of habit strength must 
be made with the same level of drive 
for all groups. If drive is being 
manipulated by means of privation,
it must be recognized that equation 
of degree of privation does not nec- 
essarily entail equation of degree of 
drive. Residual effects of prior differ- 
ential privation may persist during 
periods of equal privation. 

3. Reinforcement for the habit to 
be learned must be on the same di- 
mension for all groups. Drive can- 
not be considered apart from rein- 
forcement. When it is said that an 
animal is motivated to engage in a 
certain kind of consummatory activ- 
ity, it is also implied that the occur- 
rence of such consummatory activity 
constitutes reinforcement for the re- 
sponse or responses which have led 
up to this activity. If it is said that 
a response is reinforced it is also im- 
plied that it is reinforced in relation- 
ship to some drive. An animal that 
is not hungry may consume food 
under certain sets of circumstances.
Such consummatory activity may be
reinforcing for responses which have 
led on to getting food. Such rein- 
forcement is to be distinguished from 
that occasioned by the consumption 
of food by a hungry animal. The dis- 
tinction is more than a theoretical 
one between primary and secondary 
reinforcement. It is a distinction 
between the effects of the two kinds 
of reinforcement. 

4. It is necessary that the experi- 
mental variables which will be manip- 
ulated indirectly and concomitantly 
with drive do not make for differential 
rates in learning. More highly mo- 
tivated animals can be expected to 
show higher levels of activity, greater 
vigor of response, faster rates of loco- 
motion, and so on. If any of these 
makes for greater learning in the ex- 
perimental situation employed then 
the effects of such a variable would be 
confounded with the effects of drive. 


Finan (4) has reported a study on the 
effect of degree of drive upon the strength 
of a habit. Four groups of rats were 
trained to make a bar-pressing response. 
Times of deprivation for the four groups 
were 1, 12, 24 and 48 hours. Thirty reinforcements
for the bar-pressing response
were given and, 48 hours following train- 
ing, all groups were extinguished. At 
the point of extinction all animals had 
been deprived of food for 24 hours. 
Strength of conditioning was measured 
by trials to extinction. The results 
follow: 


Deprivation in hours          1 hr.    12 hr.    24 hr.    48 hr.
Mean trials to extinction     31.6     62.0      53.8      45.6


It is to be noted that an apparent re- 
lationship exists between degree of 
drive and habit strength. The relation- 
ship is such that the maximum effect is 
somewhere beyond that degree of drive 
associated with one hour of food privation 
and somewhere short of the degree of 
drive associated with 24 hours of privation.
Finan reports differences in mean
rate of extinction between the two inter- 
mediate drive groups and the extreme 
drive groups as significant of real differ- 
ences at confidence levels ranging from 
the five percent to beyond the one per- 
cent level. These results are somewhat 
at variance with what one might have 
anticipated. Assuming a relationship 
to exist, it would seem more plausible 
that increasing drive would be associated 
with increasing strength of conditioning. 
That the results are to be explained on 
the basis of the fact that drive was 
greater with 12 hours privation than 
with 48 seems unlikely. Finan reports 
observations indicating that, in terms of 
level of activity, animals in the 48-hour 
group were more highly motivated than 
those in the 12-hour group. Taking the 
results at face value the more likely 
explanation is that, for the learning task 
employed, habit strength is greater when 
animals have been trained with an 
optimum level of drive, and that that 
optimum is less than the maximum level 
of drive. 

MacDuff (9) carried out an investiga- 
tion designed to study the effect of 
varying degrees of drive at the point of 
training upon retention. The results 
of two experiments with albino rats are 
reported. Both involved the use of 
multiple-T mazes. In one experiment 
animals were given training trials one 
day per week until a learning criterion 
had been attained. A 16-unit maze was 
used. Three groups of albino rats were 
trained following 12, 24 and 48 hours 
deprivation of food. Following training 
all animals were placed on a 24 hour 
feeding schedule for six weeks. At the 
end of six weeks, the animals relearned 
the maze. 

In a second experiment, a 6-unit maze 
was employed. Training was carried 
out with massed trials. Degrees of pri- 
vation were those used in the first experi- 
ment. After two weeks on a 24-hour 
feeding schedule the animals relearned 
the maze. The results from the two ex- 
periments were in the same direction. 
The 48-hour groups showed greatest 
retention and the 12-hour groups showed 





298 


least retention. Although the differ- 
ences between groups were not as stable 
statistically as one would wish, the 
trends were sufficiently marked and con- 
sistent as to make it seem improbable 
that the observed differences are at- 
tributable to sampling errors. The re- 
sults of these experiments are not con- 
sistent with those reported by Finan. 
Evaluation of the two experimental 
designs against our design desiderata 
reveals some possible sources of dis- 
crepancy. 

It does not seem likely that any lack of 
differentiation of groups with respect to 
drive existed in either investigation. 
Nor does it seem likely that the most 
extreme deprivation employed (48 hours) 
was beyond the point where deprivation 
would be expected to lead to decreased 
drive. In the MacDuff experiment the 
average time per trial was least for 
48-hour groups. Performance of 48-
hour groups during training in all its
measured aspects save one was superior.
In the massed trials experiment the
48-hour groups required the greatest
number of trials to reach the learning
criterion. Even this single divergence
from the main trend is attributed, by 
MacDuff, to higher motivation. 

In the Finan experiment it is possible 
that in the case of the 12-hour group 
privation was greater than the reported 
time of privation would indicate. Prior 
to training, the 12-hour group was on a 
24-hour maintenance schedule. On the day
immediately before training, feeding was
delayed 36 hours. This permitted run- 
ning trials 12 hours following feeding 
without varying the time of day for 
training for groups. If privation were 
to be considered in terms of the 48-hour 
period preceding training, the 12- and
24-hour groups could be considered to 
have equal privation, since both had 
equal amounts of food during this 
period. The 12-hour group cannot be 
considered to have had a greater degree 
of privation than the 48-hour group. 

Maintenance of a 24-hour feeding 
schedule for periods of two to six weeks 
following training should have made for 
equal motivation for all animals em-
ployed in the MacDuff experiment.
The second requirement noted above 
would seem to be satisfied. 

In the case of the 12-hour group in the 
Finan study, it has been pointed out 
that on the day before training feeding 
was delayed 36 hours. On the day of 
training the 12-hour group was fed half 
the usual daily ration. Twenty-four 
hours later these animals were fed a 
full ration. On the following day ex- 
tinction trials were given. All other 
groups received either a full ration or, in 
the case of the 48-hour group, a double 
ration, on the day of training. These 
groups were fed full rations on the fol- 
lowing day. If privation is determined 
over a 48-hour period, then at the point 
of extinction, the 12-hour group had the 
greatest degree of privation, the 24-hour 
and the 1-hour groups next greatest, 
and the 48-hour group least. The fall- 
ing gradient of trials to extinction from 
the 12-hour point could then be explained 
on the basis of a falling gradient of 
strength of drive at the point of ex- 
tinction. 

The MacDuff study seems to satisfy 
the requirement of unidimensionality of 
reinforcement. The Finan study may 
not. It is quite possible that the rein- 
forcement for the bar pressing response 
in the case of the 1-hour deprivation 
group was largely secondary rather than
primary. During the period when all 
animals were on a 24-hour maintenance 
schedule, the animals were fed on two 
separate days for 30 min. in the experi- 
mental boxes. On a third day, again 
following 24 hours of food deprivation,
each animal received 40 food pellets 
through the food magazine released by 
the experimenter at a rate determined 
by the rat’s rate of eating. Each de- 
livery was accompanied by the click of 
the magazine vendor. Under this pro- 
cedure we would anticipate that both the 
pellets and click would acquire second-
ary reinforcing properties (2, 3). Ani-
mals trained following one hour of pri-
vation could be expected to respond to
both the apparatus bar and the pellets
employed in the absence of hunger drive
and of primary reinforcement (1).² A
further fact to be considered is that 
during extinction trials the click of the 
vendor was absent. For the 1-hour 
group, where secondary reinforcement 
could be assumed to be more prominent
in the pattern of reinforcement, the
omission of the click may have had 
greater significance. This is a point 
which would need to be settled by ex- 
perimentation. 

The MacDuff experiment can be
criticised on the basis of failure to ade- 
quately control variables known to be 
relevant in learning. It has been noted 
above that the 48-hour privation groups 
required significantly less time, on the 
average, for training trials than the other 
groups. Since running time in the maze 
is positively related to the delay of rein- 
forcement, the 48-hour groups were 
trained with a shorter delay of reinforce- 
ment than the less highly motivated 
groups. Differences in habit strengths, 
as revealed in retention trials could be 
ascribed then to differences in delay of 
reinforcement rather than differences in 
strength of drive, since it is well estab- 
lished experimentally that decreased 
delay makes for increased learning (7, 
13). 

In Finan’s study, delay of reinforce- 
ment is adequately controlled, since 
food follows immediately after the mak- 
ing of the response to be learned. The 
instrumental conditioned response tech- 
nique seems ideally adapted to investigat- 
ing the problems of relationship between 
habit acquisition and drive for this rea- 
son. The maze experiment seems poorly 
adapted for the same reason. 

Finan did not control the spacing of 
trials and this fact may have made for 
development of a greater degree of 
reactive inhibition³ in the case of the
more highly motivated animals. There 
is not only the question of massing versus 


2 It should be remembered, of course, that
Finan’s 1-hour group had not been satiated. 
The animals in this group were fed 10 grams of 
food and might well be considered to have been 
hungry. 

3 Reactive inhibition as it is employed here 
has the meaning ascribed to it by Hull (6). 




spacing of trials but also the question of 
relative vigor of response. This latter 
factor is known to be relevant in the 
production of inhibitory potential (8, 
10, 12). There was little significant 
difference in total time required for 
learning by all groups. Under the as- 
sumption that there was no difference 
in the distribution of total time by trials 
for the groups, the factors making for 
growth of inhibition being considered 
here would be adequately controlled for 
all groups. The results of the experi- 
ment to be reported here make this 
assumption untenable. 

Finan used trials to extinction as the 
measure of strength of conditioning. 
The use of such a measure involves the 
assumption that there is no interaction 
between growth of inhibition and level 
of drive. The results of the experiment 
to be reported here do not lend support 
to such an assumption. 

There is one aspect of the Finan ex- 
periment which has not yet been touched 
upon. Finan states the weight of 40 
food pellets to be two grams. Each 
pellet, then, weighed 50 milligrams. 
Since the 12-hour group showed evidence 
of greater strength of conditioning than 
the 48-hour group, a relationship be- 
tween degree of drive and appropriate- 
ness of reward suggests itself. For a 
highly motivated animal, a 50 milligram 
pellet may have less significance in rela- 
tionship to need than it has for a less 
highly motivated animal. Following 
this line of reasoning it might be antic- 
ipated that increasing the amount of 
reward would cause a shift of the opti- 
mum drive level away from 12 hours of 
privation towards a higher degree of 
privation. 


PURPOSE 


The experiment reported here was 
designed to answer these questions: 
1. Will a lower level of drive be as- 
sociated with a higher strength of 
conditioning in a situation where 
(a) a different learning task than 
that employed by Finan is used, (b) 
a new colony of animals is sampled,
and (c) a different method for con-
trolling maintenance is used?

2. Will measurably increasing the 
amount of reinforcement from that 
employed by Finan lead to a greater 
strength of conditioning with more 
highly motivated animals and to a 
lesser strength of conditioning with
less highly motivated animals? 

3. Where a trial to trial measure of 
performance in terms of response 
latency is made available will any 
interaction between drive and reactive 
inhibition be manifested? 


PROCEDURE AND APPARATUS 


Apparatus.—The apparatus employed con- 
sisted of a box 20 in. in length, 44 in. in width, 
and 54 in. high (inside dimensions). Five and 
one-fourth in. from one end of the box, a small 
partition was dropped from the top of the box 
to a depth of 1§ in. This left an opening ap- 
proximately 4 in. square. A black, heavy card- 
board door was hinged at the bottom of the 
partition. The door and partition separated a 
small compartment from the rest of the box. 
The end of the box enclosing this compartment 
was set at an angle so that the upper area of the 
enclosed compartment was about 3 in. in length. 
When the door was swung into the compartment 
it could be moved upwards only about 34 in. 
A hollow recess was made in the floor of the com- 
partment } in. deep and 14 in. in diameter, and 
34 in. from the door. The door was counter- 
balanced so that it moved with only a slight 
pressure and remained in any position in which 
it was placed. The box was painted black and 
covered on top with hardware cloth. The ap- 
paratus, then, consisted simply of a straight, 
short alley with a nose-under door at one end, 
permitting access to a food compartment. The 
food receptacle was close enough to the door that 
the animal had access to the food immediately 
following the response of nosing under the door. 

Pre-training.—All animals * were placed on a 
24-hour maintenance schedule several weeks be- 
fore experimentation was to begin. Animals 
were fed in their living cages, in groups of three 
to five, 12 grams of Purina laboratory chow 


* The animals employed were from the animal 
colony maintained by the department of psy- 
chology at the University of Missouri. They 
were male albino rats 90-120 days old. They 
had had 10 tests for susceptibility to audiogenic 
seizure at ages five to seven weeks.

checkers. During the pre-training period, the
animals were continued on a 24-hour schedule
and fed in the home cage. The pre-training was 
carried out during two consecutive days as 
follows. 

First day. Following 24 hours of deprivation, 
an animal was placed in the apparatus with the 
door open. A 60-milligram pellet was placed in 
the box at the end opposite the food compart- 
ment. He remained in the box for five min. or 
until he had moved down to the food compart- 
ment and had taken and eaten the food. If at 
the end of five min. he had not eaten the food 
pellet, he was removed from the apparatus and 
placed alone in a large metal pail of the type 
used in the laboratory for transporting animals. 
The pail contained wood shavings and a glass 
sponge cup. The food pellet was removed from 
the box and placed in the sponge cup. Fifteen 
to twenty min. later the animal was again placed 
in the apparatus with the door raised and a pellet 
in the food compartment. He was permitted 
three min. in which to respond by moving 
towards the food compartment, taking the pellet 
and eating. If the animal so responded he was 
removed from the apparatus. If he had not 
responded at the end of three min., he was re- 
moved and placed in the carrying pail and the 
pellet was placed in the sponge cup. This pro- 
cedure was continued for three more trials, with 
the same three minute period. If by the fifth 
trial the animal, when placed in the apparatus, 
moved down to the food compartment and took 
and ate the pellet immediately, pre-training was 
continued through a second day. If the animal 
by the fifth trial was not so responding he was 
discarded. 

Second day. On the second day the animal 
was placed in the apparatus with the door down. 
When the animal moved down the alley to the 
food compartment the door was raised slowly by 
the experimenter, permitting access to the food 
compartment. Five to ten min. later he was 
given a second trial in the same manner. On 
the third trial the animal was permitted to 
attempt lifting the door with his nose, being 
assisted if necessary by the experimenter’s par- 
tially raising the door. Some partial assistance 
was sometimes necessary on the fourth trial. 
On the fifth trial the animal was required to lift 
the door without assistance and take and eat the 
food pellet. If he failed to do this he was 
discarded. 

Training.—The animal was placed in the 
apparatus at the end of the alley opposite the 
food compartment. A trial consisted of the 
animal moving towards the door, nosing it up 
and taking and eating the pellet. Trials were 
run without interruption until a total of 25 had 
been completed. The animal was removed from
the box at the end of each trial. Time for re-
sponse was measured from the point when the
animal had been placed in the apparatus and was
oriented towards the door to the point when he
nosed the door. Total time required for 25
trials was recorded. The food pellet used in
training weighed 160 milligrams. This stood in
the ratio 3.2/1 to the pellet weight used by
Finan. Total amount of food used for the train-
ing trials was four grams. The ratio of this to
the amount of food employed by Finan was
4/1.5.

TABLE I
Mean Number of Trials to Extinction

Group    N    Mean     M1 − M2    t       Confidence Level
LD       16   30.44
HD       16   17.19    13.25      2.44    < 5%, > 2%

[One further entry, 10.99, is legible in the table, but its row and column cannot be recovered.]
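
The quantities given in this paragraph can be checked with a few lines of arithmetic. The following Python sketch uses only figures stated in the text: Finan's 40 pellets weighing two grams, the 160-milligram pellet, the 25 training trials, and the 1.5 grams that appears as the denominator of the stated ratio. The variable names are illustrative and form no part of the original procedure.

```python
from fractions import Fraction

# Figures stated in the text.
FINAN_PELLETS = 40                       # pellets reported by Finan
FINAN_PELLET_TOTAL_G = Fraction(2)       # stated weight of those 40 pellets (grams)
PELLET_MG = 160                          # pellet weight in the present experiment
TRAINING_TRIALS = 25                     # reinforced trials per animal
FINAN_TRAINING_FOOD_G = Fraction(3, 2)   # denominator of the stated ratio 4/1.5

# Weight of one of Finan's pellets, in milligrams.
finan_pellet_mg = FINAN_PELLET_TOTAL_G * 1000 / FINAN_PELLETS

# Ratio of the present pellet weight to Finan's pellet weight.
pellet_ratio = Fraction(PELLET_MG) / finan_pellet_mg

# Total food given during training in the present experiment, in grams.
training_food_g = Fraction(PELLET_MG * TRAINING_TRIALS, 1000)

# Ratio of total training food to the amount employed by Finan.
food_ratio = training_food_g / FINAN_TRAINING_FOOD_G

print(f"Finan pellet weight: {float(finan_pellet_mg):.0f} mg")
print(f"Pellet weight ratio: {float(pellet_ratio):.1f}/1")
print(f"Training food:       {float(training_food_g):.0f} g")
print(f"Food ratio:          {float(food_ratio):.2f}/1")
```

Exact fractions are used so that the ratios 3.2/1 and 4/1.5 are reproduced without rounding error.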

Extinction.—Extinction trials were run with 
massed trials. Response time was measured 
from the time the animal was placed in the box 
until he responded by nosing up the door. Trials 
were continued until the animal had failed to 
respond during a period of five min. The animal 
was removed from the box after each trial and 
placed again, immediately, in the apparatus. 

Maintenance.—Following pre-training, ani- 
mals were assigned at random to one of two 
groups; a low drive group (LD) and a high drive 
group (HD). The LD group was fed the usual 
daily ration of 12 grams. The HD group was 
fed 3 grams. Animals were fed in individual 
cages. Twenty-four hours later all animals were 
trained. 

Following training, animals were returned to 
the individual cages and 100 grams of food was 
placed in each cage. Twenty-four hours later 
the uneaten portion was removed and weighed. 
Measure of amount eaten was estimated by sub- 
traction of remainder from 100 grams. This 
gave only a relative measure, since some portion 
of the 100 grams was lost, during feeding, 
through the meshes of the false bottom of the 
cage. The animals were given no more food at 
this time. On the two following days the ani- 
mals were fed, in individual cages, the regular 
daily ration of 12 grams. On the fifth day follow-
ing training, extinction trials were given.


RESULTS 


Mean numbers of trials to extinc- 
tion for both experimental groups are 
presented in Table I. The LD
group required a significantly greater
number of trials to reach the criterion 
of extinction than did the HD group. 
A graphic distribution of trials to 
extinction is given in Fig. 1. Each 
X indicates the level of performance 
of a single animal. The median for 
the distribution of extinction trials 
for the LD group is indicated. It will 
be observed that 14 of the 16 animals 
in the HD group fall below the 
median for the LD group. The 
tendency for the measures for the HD 
group to cluster towards the lower 
end of the distribution is notable.⁵
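
The entries of Table I can be checked against one another. The Python sketch below takes the two group means, the value of t, and the group size from the table. It assumes that the test was two-tailed with 30 degrees of freedom (16 + 16 − 2), and the two critical values are copied from standard tables of t rather than from the paper. The standard error it prints is implied by the reported figures and is not itself reported.

```python
# Values taken from Table I.
M_LD = 30.44        # mean trials to extinction, low drive group
M_HD = 17.19        # mean trials to extinction, high drive group
T_REPORTED = 2.44   # reported t
N_PER_GROUP = 16    # animals per group

# Assumed: two-tailed critical values of t for 30 degrees of freedom,
# copied from standard tables (not given in the paper).
T_CRIT_05 = 2.042
T_CRIT_02 = 2.457

diff = M_LD - M_HD                    # difference between means
implied_se = diff / T_REPORTED        # standard error implied by t
df = 2 * N_PER_GROUP - 2              # degrees of freedom under the assumption above

print(f"Difference between means: {diff:.2f}")
print(f"Implied standard error:   {implied_se:.2f}")
print(f"Degrees of freedom:       {df}")
print("t lies between the 5% and 2% critical values:",
      T_CRIT_05 < T_REPORTED < T_CRIT_02)
```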
Data for response latencies during 
training trials are presented in graphic 
form in Fig. 2. Latencies (stated in 
units of .01 sec.) have been converted
to reciprocals and multiplied by 1000,
and arithmetic means of values for
5-trial intervals have been deter-
mined. Reciprocals are employed in
the present instance under the as- 
sumption that a difference in value 
between two shorter latencies is 
significant of a greater difference in 
associated habit strengths than the 
same difference between two longer 
latencies. The function y = 1/x has
the simple property desired and the 
merit of convenience of application. 
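
The transformation just described may be sketched as follows. The function `speed_scores` and the latencies in `example` are hypothetical and serve only as illustration; they are not data from the experiment. Latencies are taken to be in units of .01 sec., as in the text.

```python
from statistics import mean

def speed_scores(latencies, block_size=5):
    """Convert latencies (in units of .01 sec.) to 1000/latency and
    return the arithmetic mean of the converted values for each
    successive block of `block_size` trials."""
    scores = [1000 / t for t in latencies]
    return [mean(scores[i:i + block_size])
            for i in range(0, len(scores), block_size)]

# Hypothetical latencies for one animal over 25 training trials.
example = [200] * 5 + [100] * 5 + [50] * 5 + [40] * 5 + [50] * 5

print(speed_scores(example))

# The property desired of y = 1/x: the same difference in latency
# (10 units here) counts for more between two short latencies
# than between two long ones.
short_gap = 1000 / 40 - 1000 / 50
long_gap = 1000 / 190 - 1000 / 200
print(short_gap > long_gap)
```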


5 Since the presence of asymmetry in the
distribution of the parent population for the HD 
group is suggested by the form of the sample 
distribution the use of the t-test of significance 
may be questioned. The t-test is used under the 
assumption that the distribution of the parent 
population does not depart from the normal 
sufficiently that the distribution of the mean will 
not tend towards normality.

Fig. 1. Distributions of trials to extinction for groups trained with different degrees
of drive and extinguished with the same degree of drive


It will be noted that the curve for 
the HD group begins at a level higher 
than that for the LD group. At the 
point for the third interval the curve 
for the HD group begins to level off, 
falls below the LD curve at the fourth 
interval, and at the fifth interval has 
fallen to a point approximating that 
for the second interval. 

Total time required for extinction 
bears an obvious relationship to 
number of trials to extinction. The 
HD group mean is 21 min. and the 
LD group mean is 29 min. The 
difference between means is significant 
above the 10 and below the 5 percent 
level of confidence.

The time required for training for
the different groups does not vary 
significantly. The HD group mean 
is 16 min. and the LD group mean is 
17 min. 

Food consumption during the 24- 
hour period following training is 
essentially equivalent for both groups. 
The mean for the HD group is 34
grams and for the LD group 33 grams. 

Fig. 2. Plots of response latencies during training for high drive (HD) and low drive (LD)
groups, in terms of 1000 times the reciprocal of the latency expressed in .01 sec.

Data for latencies during extinction
trials are not presented here. The
reasons for this are several. Many
animals extinguish rapidly and there-
fore latencies for these are not avail-
able throughout the extinction series
for the group. One has a choice of
manufacturing a latency value or of
considering only latencies for non- 
extinguished animals. Neither ex- 
pedient is quite satisfactory. Fol- 
lowing the first few non-reinforced 
trials some animals exhibit inhibitory 
effects due apparently to emotionality. 
Such animals will characteristically 
show long latency responses followed 
by shorter latency responses when 
some emotional adaptation has taken 
place. The extinction process in such 
an experimental situation as the one 
established for the present study 
shows a tendency to be cyclical. 
During a short series of extinction 
trials successive responses may show 
increasing latencies. There may then 
follow a second series of relatively 
short latency responses. This may 
be ascribed to the growth of internal 
inhibition during a series of short 
latency responses and dissipation of 
internal inhibition during the series 
of long latency responses. 

For the first two extinction trials, 
harmonic means of latencies for the 
HD group are 1.02 and 1.40 sec., and 
for the LD group .97 and 1.05 sec. 
Individual variability is so marked 
that the observed differences cannot 
be taken as significant of any real 
differences. Any significance at- 
tached to observed differences must 
be in relationship to the acquisition 
curves and the subsequent course of 
extinction for the two groups. 


DISCUSSION OF RESULTS


The results of the study reported 
here are consistent with the results of
the Finan experiment. Change of
learning task and method for manip- 
ulation and control of food mainten- 
ance produce no notable changes in 
the gross features of the relationship 
between strength of drive and strength 
of conditioning. Measurably in-
creasing the amount of reinforcement
does not alter the basic relationship
first noted by Finan and again demon-
strated in the results of the
present experiment. It seems highly 
probable that the following general 
statement is true: 

Assuming that reinforcement re- 
mains qualitatively the same (e.g., 
the same primary reinforcement for 
the same primary drive), that an 
instrumental conditioning procedure 
is employed, and that training and 
extinction are effected by the use of 
massed trials, animals trained with a 
lower level of drive will require a 
greater number of trials for extinction 
than animals trained with a higher 
level of drive, extinction trials being 
administered when level of drive has 
been equated for both groups of 
animals. It does not follow nec- 
essarily, from this statement, that a 
greater number of extinction trials is 
indicative of a greater strength for 
the habit which has been acquired. 
Hence it does not follow necessarily 
that, if a group of animals trained 
with a low level of drive requires a 
significantly greater number of trials 
to reach extinction of a response than 
a group trained with a high level of 
drive, then the strength of the habit 
acquired by the low drive group is 
greater than that acquired by the 
high drive group. To draw such an 
inference would require the assump- 
tion that the two response systems are 
equivalent with respect to the opera- 
tion of inhibitory factors. The as- 
sumption is unwarranted in the case 
of the present experiment. There is 
evidence of a relationship between 
drive and inhibition. It does not 
seem possible to specify this rela- 
tionship with any exactness at pres- 
ent. We shall examine certain hy- 
potheses. 

Whenever a response occurs, or, in 
more general terms, whenever a habit 
is exercised, there is established for 
the organism involved, a disposition
opposed to the repetition of the re-
sponse; i.e., inhibition develops. It
is clear in view of the results of ex- 
perimentation by James (8), Mowrer 
and Jones (10), and Solomon (12), 
that inhibition develops as a function 
of the effort expended in making a 
given response.⁶ It is also a well
established experimental fact that 
inhibition can become conditioned to 
a stimulus in the same manner that 
any positive response is conditioned 
(11). Whenever a response is evoked
and some increment of inhibition 
accrues, some residual of this inhibi- 
tion remains as conditioned inhibi- 
tion. Should a response involve a 
greater degree of effort then a greater 
amount of inhibition will develop 
and a larger residual of conditioned 
inhibition will result. 

When we examine the acquisition 
curves for the HD and LD groups 
we note that there is evidence that 
during the early trials HD animals 


exhibit shorter latency responses than 


do LD animals. This is consistent 
with expectations as to differences 
to be observed between the behavior 
of highly motivated and less highly 
motivated animals. Since a more 
rapid response will make for greater 
effort expended, greater reactive in- 
hibition will result. When reactive 
inhibition becomes sufficiently great 
it reaches the point where its effect 
counteracts the effect of increased 
strength of habit to the extent that a 
decrement in response results. It will 
be observed that the HD curve begins
leveling off after the 15th trial and
begins to fall after the 20th trial.
The terminal point on the HD curve
is at about the same level as the
point for the 5th to 10th trial interval.


6 It is not intended that inhibition is to be
considered solely or always a function of effort
expended. Inhibition as a general disposition
towards response blocking is relatable to stimu-
lus factors, to response factors, to response com-
petition (interference), and to emotional factors.
It is possible that certain types of inhibition are
subsumable under other types. The response-
related inhibition under discussion above may
prove to be reducible to effects of proprioceptive
stimulation and hence subsumable under Pavlov-
ian internal inhibition.

A higher level of reactive inhibition 
at the point of training is associated 
with a higher level of conditioned in- 
hibition. Consequently, at the point 
of extinction for the HD and LD 
groups, the two groups are not equiv- 
alent with respect to inhibitory po- 
tential. The habit acquired by the 
HD group has a greater amount of 
conditioned inhibition integrated 
within it. Since it is assumed that 
reactive inhibition and conditioned 
inhibition combine additively, the 
HD group shows a faster rate of 
extinction than the LD group. 
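
The reasoning of this paragraph can be put in the form of a toy calculation. The Python sketch below is an illustration only, not a model fitted to the data: the function `trials_to_extinction`, the arbitrary units, and every numerical value are hypothetical. It assumes that responding ceases when habit strength no longer exceeds the sum of conditioned and reactive inhibition, and that each unreinforced trial adds a fixed increment of reactive inhibition.

```python
def trials_to_extinction(habit, conditioned_inhibition, increment_per_trial):
    """Count unreinforced trials until habit strength no longer exceeds
    the sum of conditioned and reactive inhibition (additive combination).
    All quantities are in arbitrary, hypothetical units."""
    trials = 0
    reactive = 0.0
    while habit - (conditioned_inhibition + reactive) > 0:
        trials += 1
        reactive += increment_per_trial   # inhibition accrues with each response
    return trials

# Hypothetical values: equal habit strength for both groups, but more
# conditioned inhibition carried over from training in the HD group.
ld_trials = trials_to_extinction(habit=10.0, conditioned_inhibition=1.0,
                                 increment_per_trial=0.5)
hd_trials = trials_to_extinction(habit=10.0, conditioned_inhibition=4.0,
                                 increment_per_trial=0.5)

print("LD trials to extinction:", ld_trials)
print("HD trials to extinction:", hd_trials)
```

Under these assumptions the group entering extinction with the larger residue of conditioned inhibition reaches the criterion in fewer trials, although habit strength is the same for both.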

The hypothesis just advanced is 
based on established empirical rela- 
tionships and generally accepted the- 
oretical principles of learning psy- 
chology. It is readily testable. Sev- 
eral alternate hypotheses can be 
advanced. One of the more attrac- 
tive involves the postulation of a new 
relationship not required under the 
first considered hypothesis. For that 
reason, the hypothesis to be examined 
next may seem to have less merit than 
the first. 

If we assume that increasing the 
degree of drive lowers the threshold for 
any stimulus, but that continued 
presentation of the same stimulus 
raises the threshold for that stimulus, 
the results of the present experiment 
are easily explained. The develop- 
ment is similar to that for the first 
hypothesis, the only important change 
involved is that of shifting emphasis 
from the response aspect of the ac- 
quired habit to the stimulus aspect. 
This second hypothesis possesses a 
greater degree of generality than the 
first. It may also be taken as a 
preliminary formulation for a general 
theory of motivation. In these re-
spects it is superior to the first hy-
pothesis. It is as readily testable.


CONCLUSION 


It is not possible to say, on the 
basis of experimental evidence now 
available, that strength of drive is or 
is not related to strength of habit. 
The question is not a purely em- 
pirical one. The problem is to a 
large extent involved in the larger 
problem of what meaning is to be 
ascribed to the constructs ‘motiva-
tion’ and ‘reinforcement.’ This
seems hardly the place to develop this 
aspect of the problem to any greater 
degree than has been done already in 
the introduction to the paper. This 
can be said: if we accept the hypoth- 
esis that habit strength is not a func- 
tion of strength of drive operating at 
the point of acquisition, then nothing 
has been produced by way of experi-
mental evidence that would necessi-
tate any abandonment of that hy-
pothesis.


SUMMARY 


1. Two groups of 16 albino rats 
each were given 25 reinforcements of 
an instrumental response. One group 
(HD) was trained 24 hours after being 
fed three grams of food, and the other 
group (LD) was trained 24 hours after 
being fed 12 grams of food. Each 
reinforcement involved presentation 
of a food pellet weighing 160 milli- 
grams. 

2. During training trials, HD ani- 
mals exhibited relatively shorter la- 
tencies of response early in the series 
and relatively longer response la- 
tencies late in the series. This trend 
was reversed in the case of LD ani- 
mals. 

3. Following training all animals 
were permitted unlimited feeding for 
24 hours and were fed 12 grams every 
24 hours thenceforth. 

4. On the fifth day following train-
ing, all animals were given extinction
trials under the same degree of food 
privation. The LD group required a 
significantly greater number of trials 
for extinction to a criterion than did 
the HD group.

5. It has been contended that 
differences in rate of extinction can be 
ascribed to differences in amount of 
conditioned inhibition incorporated 
into the habits learned by the two 
groups. 


(Manuscript received July 21, 1948)


A STUDY OF MOTIVATING CONDITIONS NECESSARY
FOR SECONDARY REINFORCEMENT ¹


BY WILLIAM K. ESTES 


Indiana University 


To what extent does conditioning 
by secondary reinforcement depend 
upon concurrent motivating condi- 
tions? A partial answer has been 
provided by experiments showing that 
effectiveness of a secondary rein- 
forcer, or secondary reward, is not 
dependent upon continued presence 
of the original drive (1). An audi- 
tory stimulus which has accompanied 
the presentation of water to thirsty 
rats will subsequently exert a sig- 
nificant reinforcing effect upon a new 
conditioned response when the ani-
mals are hungry but not thirsty.² It
is not clear from these data, however, 
whether the second drive plays any 
essential role in the generalization. 
Will frequency of a conditioned re- 
sponse increase under the influence of 
secondary reinforcement if all of the 
primary drives are reduced to minimal 
levels, or is the presence of some 
source of primary motivation a nec- 
essary condition for response evoca- 
tion? Only in the former case would 
it be necessary to introduce a concept 
of secondary motivation to account 
for the effects of secondary rein- 
forcers. 

The present study has been de- 
signed to obtain evidence on this 
question by examining the dependence 
of generalized secondary reinforce- 
ment upon motivational level. First, 
an originally neutral auditory cue is 
established as a secondary reinforcer 


1 This paper was presented in part at the fall, 
1948, meetings of the American Psychological 
Association. 

2 Schoenfeld (6) has obtained a similar trans- 
fer of secondary reinforcement in the reverse 
direction, i.e., from hunger to thirst. 


by making it the occasion for pres- 
entation of water when the animals 
are thirsty. Then the effectiveness 
of the auditory cue in strengthening a 
new response is tested under two con- 
ditions: (1) the animals have been 
satiated on water but deprived of 
food for 23 hours; (2) hunger, thirst, 
and ‘exploratory’ drives have all been 
reduced to low levels. 


PROCEDURE 


The apparatus, which consists of four Skinner- 
type conditioning units, has been described in 
detail elsewhere (2). Each rat was enclosed 
during the experimental period in a partially 
sound-proofed box containing a magazine which 
could be set to deliver small quantities (five 
mgm.) of water. The auditory stimulus used as 
a secondary reinforcer was the buzz of the maga- 
zine motor, a rather loud complex sound which 
preceded the delivery of water for 4 sec. when-
ever the magazine was operated during the train-
ing periods. A small horizontal aluminum bar
could be made available to the animal through a 
slot in the wall of the experimental cage. De- 
pression of the bar a distance of 3 in. with a 
force of 44 gm. actuated a rotary microswitch, 
permitting electrical impulses to go to the re- 
corder and, when the procedure called for rein- 
forcement, to the magazine motor. To provide 
secondary reinforcement alone, the water reser- 
voir was removed from the magazine, and elec- 
trical connections set so that each depression of 
the bar was followed for $ sec. by the sound of 
the magazine motor. Whenever the bar was 
present in the animal’s compartment, a cumu- 
lative curve of bar-pressing responses vs. time 
was recorded. 

Subjects were 12 male albino rats obtained 
from an unselected stock maintained by a local 
commercial vendor. They were all experimen- 
tally naive. The animals were housed in small 
individual cages between experimental sessions. 
No control of temperature or humidity was pos- 
sible. The temperature in the laboratory at the 
time of the pre-test was 20 degrees C., at the 
time of the test session, 27 degrees C. 

The design of the experiment is summarized
in Table I. Prior to the first experimental
period, the animals had water and dry food avail- 
able at all times. The pre-test was conducted 
to determine the initial unconditioned rate of 
bar-pressing. During this period, the bar was 
present continuously in the experimental com- 
partment for 30 min., and all responses were 
recorded; no reinforcement of any kind was 
given. After this period, the rats were divided 
into three groups of four animals each, matched 
for conditioning unit and so far as possible for 
rate of responding on the pre-test. Where the 
latter could not be accomplished, animals with 
the higher unconditioned rates were assigned 
to Group III. 

During the five days of training, the bars 
were removed from the experimental compart- 
ments. All animals were given access to water 
for one hour immediately after the daily experi- 
mental period; dry food was available in the liv- 
ing cages at all times. Groups I and II received 
the following training: On the first day, water 
was present in the magazines at the beginning of 
the period; all of the animals discovered the 
water and drank it. On the second day, the 
magazines were operated approximately once 
per min. until each rat had responded to the 
sound of the motor five times by approaching the 
magazine and drinking the water. On the third 
and fourth days each rat was subjected to 20 
operations of the magazine, and on the fifth day, 
to 10 operations. All of the animals responded 
to each occurrence of the magazine sound during 
the last three periods, thus receiving a total of 
55 reinforcements. On each of the five days, 
the animals of Group III, which were to serve 
as controls, were placed in the conditioning boxes 
for 30 min.; they did not receive water in the 
apparatus at any time. 

At the time of the test, subsequent to this 
series of training sessions, motivating conditions 
were as follows. All three groups had water con- 
tinuously available for the preceding 23 hours. 
During the test period, flat dishes of water were 
present in the experimental compartments. 
Since these dishes are much easier to drink from 
than the water bottles attached to the living 
cages, it is probable that the thirst drive was
effectively satiated. Groups I and III had been
deprived of food for 23 hours; Group II had been 
deprived of food for six hours. Since the experi- 
ment was run in the afternoon, and the tempera- 
ture in the laboratory was relatively high (27 
degrees C.), it is likely that the ‘activity drive’ 
was at a very low level for all groups. During 
the 30-min. test session, the bars were present 
continuously in the experimental compartments, 
and all bar-pressing responses were recorded. 
In the case of Groups I and II, electrical connec- 
tions were set so that each bar-depression was 
followed by the sound of the magazine motor. 


RESULTS 


The number of bar-pressing re- 
sponses evoked from each animal 
during the pre-test and test periods is 
given in Table II. Student’s t-tests
for the various group comparisons are 
summarized in Table III. It will be 
noted that the differences between 
groups on the pre-test do not ap- 
proach significance. The results for 
Group III (control) indicate that the 
effect of the pre-test period was to 
adapt out most of the animals’ uncon- 
ditioned exploratory behavior with 
respect to the bars, the decline in rate 
of bar-pressing yielding a value of t 
with a probability of only .06 in a
random sampling distribution. 
Group II (low drive) exhibits a simi- 
lar decline in rate of responding from 
pre-test to test; evidently secondary 
reinforcement had no effect on the 
behavior of these animals. In the 
case of Group I (high drive), second- 
ary reinforcement apparently exerted 
a small but measurable effect. All 
four animals in this group yield an 


TABLE I

SUMMARY OF EXPERIMENTAL PROCEDURES

Group             Pre-test    Training    Test Conditions
I. High Drive                 55 (S—W)    23 Hr. Hunger; R—S
II. Low Drive                 55 (S—W)    6 Hr. Hunger; R—S
III. Control                  None        23 Hr. Hunger; R—

Legend: S = Sound of magazine; W = Water; R = Bar-pressing response.







TABLE II

NUMBER OF BAR-PRESSING RESPONSES EVOKED FROM EACH SUBJECT
DURING PRE-TEST AND TEST PERIODS

[Table body not recoverable; columns: Group I (High Drive), Group II (Low Drive), Group III (Control), each with pre-test and test counts; rows: individual animals, Mean, Mean gain.]











TABLE III

STUDENT’S t-TESTS FOR DIFFERENCES BETWEEN GROUPS ON PRE-TEST AND TEST PERIODS
AND FOR DIFFERENCES IN MEAN GAIN FROM PRE-TEST TO TEST PERIOD

[Table body not recoverable; columns: Pre-test, Test, Gain.]

















increase in responding on the test 
period, and a comparison of mean 
change from pre-test to test with the
comparable statistics of Groups II
and III yields values of t which ex-
ceed the two percent level of con-
fidence. These estimates of statis-
tical reliability are probably conserva-
tive, since the animals of Group I had
the lowest mean activity level, as 
determined by the pre-test. The 
rate of responding of Group I on the 
test period was considerably lower 
than that of groups with similar con- 
ditioning histories tested under the 
same degree of hunger in other experi- 
ments (1). The difference is prob- 
ably to be accounted for largely in 
terms of the pre-test period of the 


present study, which resulted in 
habituation of exploratory responses. 
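
The text does not state which form of Student’s t was used for the comparison of mean gains. As a minimal sketch, assuming the independent-groups form with pooled variance (a matched-pairs form is equally possible, since the groups were matched on the pre-test), the computation might run as follows. The gain scores are hypothetical and are not the values of Table II; the function name is illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's t for two independent samples, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, na + nb - 2


# Hypothetical gain scores (test minus pre-test), NOT the data of Table II.
gain_high_drive = [4, 6, 3, 7]
gain_control = [-5, -2, -6, -3]

t, df = independent_t(gain_high_drive, gain_control)
print(f"t = {t:.2f} with {df} d.f.")
```

With four animals per group, such a comparison rests on six degrees of freedom, so that only fairly large differences in mean gain can reach conventional levels of confidence.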


DISCUSSION


Taking the results of this study in 
conjunction with those of a previous 
investigation (1), we are in a position 
to formulate a tentative statement of 
the dependence of secondary reinforce- 
ment upon primary motivating con- 


ditions. Firstly, the effectiveness of 
a secondary reinforcer in strengthen- 
ing new responses is not specific to 
the original motivating conditions, at 
least in the case of the thirst and 
hunger drives. Secondly, it appears 
that the strength of a response, as 
measured by rate of occurrence, is
increased by secondary reinforcement
only if some organic drive (e.g., food 
deprivation or water deprivation) is 
strong enough to instigate activity. 
There is no evidence, from these 
studies, to suggest that an organism 
will work for secondary reinforcement 
in the absence of any ‘primary’ mo- 
tivation. This conclusion appears to 
be in harmony with those of Koch 
and Daniel (4) and Wolin (8), working 
with the rat and the pigeon, respec- 
tively, who report virtually zero rates 
of conditioned responses (previously 
reinforced with food) when the ani- 
mals are thoroughly satiated on both 
food and water. 

Recent investigations concerning 
the relation of motivation to strength 
of conditioning evidently require some 
sort of revision of the traditional 
formulations of law of effect or in- 
strumental conditioning. It has been 
shown by Estes (1), Miller (5), Webb 
(7), and Wolin (8) that a conditioned 
response reinforced under one drive 
will be evoked by the same conditioned 
stimuli under other drives (for ex- 
ample, a bar-pressing response con- 
ditioned by water-reinforcement will 
occur in considerable strength when 
the animal is hungry but not thirsty). 
The present series of experiments by 
the writer has demonstrated a similar 
generalization of the effects of second- 
ary reinforcement between the hunger 
and thirst drives. Are these two sets 
of transfer phenomena to be con- 
sidered simply as two different em- 
pirical relationships, or may one be 
derived, or ‘explained,’ in terms of the 
other? Some related evidence ap- 
pears to favor the latter possibility. 
In all studies of secondary reinforce- 
ment, the secondary reinforcing cue 
is known to be also a conditioned 
stimulus for some specific response. 
In the present experiment, for ex- 
ample, the original training estab- 




lishes the sound of the magazine 
motor as a conditioned stimulus for 
the response of approaching the 
magazine. It has been suggested by 
Hull (3) that a secondary stimulus will 
exert a reinforcing effect on new re- 
sponses only so long as it concurrently 
evokes its own conditioned response. 
In terms of the present study, the 
sound of the motor would be effective 
in reinforcing bar-pressing only so 
long as each presentation of the sound 
elicited the response of approach to 
the magazine. Qualitative observa- 
tions made by the writer (1) lend some 
support to this generalization. Even 
‘transfer’ animals, which are tested 
while hungry after having experienced 
combined presentations of sound and 
water while thirsty, will respond by 
approaching the magazine when the 
sound is administered as reinforce-
ment for bar-pressing during a test
period. From this point of view, 
transfer of secondary reinforcement 
would be regarded as mediated by 
transfer of the tendency for the sec- 
ondary cue to elicit its own condi- 
tioned response. It does not seem 
possible to decide on the basis of 
presently available evidence whether 
the critical variable in conditioning 
by secondary reinforcement is simply 
the occurrence of the response to the 
secondary cue or the resulting change 
in the pattern of stimulation imping- 
ing upon the organism. 


SUMMARY 


Previous experiments have shown 
that an originally neutral stimulus 
which has been associated with the 
presentation of water to thirsty ani- 
mals will subsequently exert a rein- 
forcing effect upon responses elicited 
when the animals are hungry but not 
thirsty. The present experiment was 
designed to verify that finding and to







determine whether the presence of a 
strong hunger drive on the test period 
is a necessary condition for the trans- 
fer of secondary reinforcement. 

Twelve albino rats were first pre- 
tested for rate of unconditioned bar- 
pressing. Next, the two experimental 
groups of four rats were subjected to 
repeated presentations of small quan- 
tities of water accompanied by a 
characteristic auditory stimulus under 
conditions of 23-hour thirst motiva-
tion. Four control rats did not
receive this training. 

On the test period, motivating con- 
ditions were as follows: control group 
and high-drive group: deprived of 
food for 23 hours, satiated on water; 
low-drive group: deprived of food for 
six hours, satiated on water. During 
the test, bar-pressing responses pro- 
duced the auditory stimulus previ- 
ously associated with water-reinforce- 
ment, but no other reinforcement. 
Rate of responding increased sig- 


nificantly over the pre-test rate for 
the high-drive group, but decreased 
for the other two groups. 

It is concluded that a secondary 
reinforcing cue will be effective in 


strengthening new responses when 
the original drive has been eliminated 
by satiation, provided that some
other source of motivation is present
in strong enough degree to instigate
activity. Presently available evi-
dence does not require the introduc-
tion of a concept of secondary motiva-
tion to account for conditioning by
secondary reinforcement.


(Manuscript received July 26, 1948) 


REFERENCES 


1. Estes, W. K. Generalization of secondary
reinforcement from the primary drive.
J. comp. physiol. Psychol. (In press)

2. Guttman, N., & Estes, W. K. A modified
apparatus for the study of operant be-
havior in the rat. J. gen. Psychol. (In
press)

3. Hull, C. L. Principles of behavior. New
York: D. Appleton-Century Co., 1943.

4. Koch, S., & Daniel, W. J. The effect of
satiation on the behavior mediated by a
habit of maximum strength. J. exp.
Psychol., 1945, 35, 167-187.

5. Miller, N. E. Theory and experiment re-
lating psychoanalytic displacement to
stimulus-response generalization. J. ab-
norm. soc. Psychol., 1948, 43, 155-178.

6. Schoenfeld, W. N. Personal communica-
tion.

7. Webb, W. B. The role of an irrelevant drive
in response evocation in the white rat.
Amer. Psychologist, 1947, 2, 303.

8. Wolin, B. R. Generalization of a condi-
tioned response between hunger and
thirst drives in the pigeon. Unpublished
M.A. thesis, Indiana Univ., 1948.





LOUDNESS OF SPEAKING: THE EFFECT OF HEARD 
STIMULI ON SPOKEN RESPONSES ¹


BY JOHN W. BLACK 


Kenyon College 


Intensity of voice contributes im- 
portantly to intelligibility irrespective 
of the speech situation. Training 
literature for instruction in voice 
communication in the services ack- 
nowledges this fact and advises the 
student to speak loudly, for example 
‘just short of shouting’ when talking 
over aircraft radio and interphone. 
The object is merely that the speaker 
make himself understood. The pos- 
sibility arises, however, that as a 
consequence of heard intensity the 
listener may reply correspondingly 
either loudly or softly. Three studies 
reported here examine the effect of 
the speaker’s signal strength upon the 
intensity of the listener-speaker’s re- 
sponses. 


APPARATUS AND GENERAL 
METHODOLOGY 


The general method for the studies was to 
present recorded stimuli to Ss individually. 
They heard the items through headphones, at 
controlled levels of intensity and responded 
orally. The intensities of the responses were 
measured. 

The stimuli were of two kinds: words and 
questions. The words were five 12-word lists 
from a standard intelligibility test—equated for 
intelligibility under conditions of low signal-to- 
noise ratios. The items were recorded by a male 
voice with as nearly equal intensity as the 
speaker was able to maintain while monitoring 
his speech with a vacuum-tube voltmeter con- 
nected to indicate the recording level. Words 
within a list were spoken at five-sec. intervals. 
Between the 12-word lists a 1000-cycle tone was 
recorded (20 sec.) at a level 20 db below voice 
peaks. Similarly, this voice read five lists of 
sentence-questions. At least three peaks in each
sentence were as intense as the peak values of 


1 Research under Navy Contract N7onr-411: 
Project 20-K-2 with the U. S. N. Special Devices 
Center and Kenyon College. 


the items of the recorded word lists. The inter- 
rogatory sentences were selected from intelligi- 
bility tests and low-level intelligence tests, and 
adapted to permit obvious one-word responses. 
The principal difference between the two types of
stimuli, other than the duration of items, was
that one called for repetition and the other for 
invention of answers. 

A calibrated microphone rested eight in. in 
front of S’s lips. It activated a General Radio 
Sound Level Meter (slow response). The play- 
back equipment included a high-fidelity pickup, 
25-watt amplifier, a calibrated attenuating pad, 
and insert-type dynamic earphones. 

The level of playback was set by monitoring 
the recorded tone and adjusting the attenuating 
pad. The amplified signals were attenuated 0,
25, 45, 65, and 85 db, values established by five
Os as appearing to divide the range of loudness 
at the headphones equally. The lowest level 
was approximately minimal for reception of the 
stimuli (no errors except with voiceless con- 
sonants) and the highest level approached pain 
at S’s ears. S sat alone in a room adjacent to 
the one in which E operated the phonograph
play-back equipment and monitored the meter. 
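
The five attenuation settings may be easier to appreciate as amplitude ratios. The following is a minimal sketch, assuming that the pad attenuates the electrical signal and that the decibel figures refer to voltage (amplitude) ratios, so that the ratio is 10 raised to the power of minus the attenuation divided by 20. The function name is illustrative only.

```python
def attenuation_to_ratio(db):
    """Amplitude (voltage) ratio remaining after `db` decibels of attenuation."""
    return 10 ** (-db / 20)


# The five attenuation settings reported for the playback channel.
for db in (0, 25, 45, 65, 85):
    ratio = attenuation_to_ratio(db)
    print(f"{db:>2} db attenuation -> amplitude ratio {ratio:.2e}")
```

On this assumption each 20-db step reduces the signal amplitude by a factor of ten, so that the weakest and the strongest stimuli differ in amplitude by more than four orders of magnitude.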

S was given three sets of instructions. 


[Verbal] Do not touch the headset after 
it has been adjusted. Do not change position. 
Make sure you are comfortable before I leave 
the room. Say the word that you think you 
hear. Talk naturally. You will hear further 
instructions through the headphones. 

[Visual] The monitor will inform you 
when this test is completed. Do not move 
your head or body. Keep your eyes open. 
Say the word that you hear. 

[Recorded] You will hear 60 words 
broken into lists of 12 words each. The lists 
are separated by a constant tone like this 
(tone). All of the words in the lists are fa- 
miliar ones. Immediately upon hearing a 
word, please say it. We shall now practice. 
Remember, as soon as you hear a word, repeat
it: “Heel” . . .
You will now hear the 60 words. 


For the sentence stimuli appropriate changes 
were made in the directions. 


[Recorded] You will hear 60 sentences
broken into five lists of 12 sentences each.
Each list is separated from others by a con-
stant tone like this (tone). Each sentence is
in the form of a question or request, and calls 
for a simple, one-word response. For ex- 
ample, you might hear, “What color is most 
paper?” You would say white. Or, “Who 
was the first President of the United States?” 
You would say Washington. We shall now 
practice. Remember, as soon as you hear a 
sentence give the appropriate one-word answer: 
“The first number after eight is what?” .. . 
“What does a drinking fountain dispense?” 
. . . That’s right. Remember, make a one-
word response to each sentence.


Order of items was not varied. The serial 
order of conditions was rotated. The Ss were 
male college students, 25 for Experiments 1-2 
and 16 for Experiment 3. 


RESULTS 


Experiment 1 set the pattern for 
the series of three studies, both in 
method and results. Successive
means of the intensity levels of re- 
sponses increased progressively as the 
stimuli became more intense except 
in the instance of the two softest 
levels. These results are summarized 
in Table I (column A) and, graphic-
ally, in the top line of Fig. 1.²

An analysis of variance was made 
of the data, and the F-ratio was 
highly significant in each of the three 
studies. This statistic was computed 
by dividing intensity (conditions) vari- 
ance by the remainder (intensity X 
subjects) variance. The significance 
of the differences between the means 
of responses to successive levels of 
intensity of stimuli was tested. The 
t’s, computed from distributions of 
differences, increased in magnitude 
with increases in the intensity of the 
stimuli. And, importantly, incre- 
ments in voice intensity in oral re- 


2 For detailed tables of means, SD’s, and
analyses of variance from the data reported in
this paper order Document 2632 from American
Documentation Institute, 1719 N Street, N.W.,
Washington 6, D.C., remitting $0.50 for micro-
film (images one in. high on standard 35 mm.
motion picture film) or $0.90 for photocopies
(6 X 8 in.) readable without optical aid.
sponses, presumably induced by the
strength of the heard stimuli, were 
disproportionately greater as the 
stimuli became very intense. Suc- 
cessive differences between the four 
means that were significantly differ- 
entiated were respectively 1.02, 3.32, 
and 5.60 db. 
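
The F-ratio described above, the variance attributable to intensity conditions divided by the intensity X subjects remainder, can be sketched as follows. This is an illustration only: the response levels are hypothetical and are not the data of this study, and the function name is not drawn from the source.

```python
def conditions_f_ratio(scores):
    """F = MS(conditions) / MS(conditions x subjects) for a subjects-by-conditions table."""
    n = len(scores)        # number of subjects (rows)
    k = len(scores[0])     # number of intensity conditions (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    # Sum of squares between conditions.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    # Remainder: what is left after removing subject and condition effects.
    ss_resid = sum(
        (scores[i][j] - subj_means[i] - cond_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    df_cond, df_resid = k - 1, (n - 1) * (k - 1)
    return (ss_cond / df_cond) / (ss_resid / df_resid), df_cond, df_resid


# Hypothetical response levels in db: 3 subjects at 5 stimulus levels.
hypothetical = [
    [72, 71, 74, 77, 83],
    [70, 70, 73, 75, 80],
    [75, 74, 76, 80, 85],
]
f, df1, df2 = conditions_f_ratio(hypothetical)
print(f"F = {f:.2f} with {df1} and {df2} d.f.")
```

With 25 subjects and five conditions the same formulas give 4 and 96 degrees of freedom.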

The means of the successive re- 
sponses were tested for linearity. F 
was highly significant and thus the 
probability of linearity was not estab-
lished. The possibility is self-evident
in viewing the curves in Fig. 1 that 
different phenomena may occur as Ss 
respond to stimuli of different intensi- 
ties. For example, in response to 
low-level signals the tendency is not to 
increase response level as stimulus 
level is raised. This is a conservative 
summary in view of the direction of 
the three curves between Conditions 
1-2 (-85 and -60 db) in the figure.
Measurements employing more care- 
fully determined threshold values and 
less gross increments in signal inten- 
sity may establish a significant decre- 
ment in the response level with small 
increases in the intensity of the heard 
signal near the threshold of hearing. 
Above the 60-db level, the means of 
responses increased as the intensity of 
the stimuli was raised. The possi- 
bility of a dichotomous population of 
means prompted a test of linearity for 
the means of the responses to the four 
highest intensities. In this analysis, 
F exceeded the five percent level of 
confidence, discounting the probabil- 
ity that this stimulus-response re- 
gression was linear. 

Table I (column B) summarizes the 
intensity of the Ss’ one-word re- 
sponses to questions that were heard 
at the five intensity levels (Experi- 
ment 2). The means of these re- 
sponses are plotted as the middle line 
in Fig. 1. Amplification of the signal

[Fig. 1. Means of intensity of oral responses of listener-speakers as a function of the strength of the heard signal. Abscissa: intensity of stimuli in db.]


was the same as with words, and the 
same Ss served in both experiments. 
It is difficult to quantify the over-all 
intensity of sentences. However, in 
addition to the fact that each sentence 
had at least three peaks that were as 
intense as the corresponding word item 
in Experiment 1, the same speaker 
read both sets of stimuli for the record- 
ing and took precautions to minimize 
variations in vocal loudness. The 


respective means of the responses to 
the successive intensity levels of the 
sentences and words are readily com- 
pared in Table I. 

Obviously, the general patterns of 
the responses are similar; this is ap- 
parent also in the two relevant curves 
in the figure. The difference between 
the mean responses to the two least 
intense lists of questions (Conditions 
1-2) was not significant. (Likewise,


TABLE I

MEAN INTENSITY IN db (GENERAL RADIO SOUND LEVEL METER) OF ORAL RESPONSES
TO FIVE LEVELS OF INTENSITY OF STIMULI

A. Repetitions: stimuli, words, 25 Ss.
B. Answering questions: stimuli, sentences, 25 Ss.
C. Repetitions: stimuli, words, 16 Ss.

                                              Intensity of Responses, Mean (SD)
Intensity Conditions of Stimuli               A               B               C
1. Minimal for understanding single words     74.02 (4.67)    71.26 (4.64)    71.25 (6.11)
2. Condition 1 plus 20 db                     73.14 (3.73)*   70.94 (4.43)*   70.56 (6.39)
3. Condition 2 plus 20 db                     [illegible]     73.26 (4.29)*   71.31 (5.68)
4. Condition 3 plus 20 db                     78.38 (4.40)*   75.78 (4.95)*   72.25 (5.76)*
5. Condition 4 plus 25 db                     83.98 (4.95)    81.14 (5.71)    75.56 (5.49)
F (intensity variance/intensity X subjects
   variance)                                  105.18 (< 1%)   86.67 (< 1%)    13.66 (< 1%)

* Significantly different from the mean immediately below. From distributions of differences, t < 1%.

these conditions were not significantly
different in Experiment 1.) All other 
means of responses to successive 
stimulus treatments were highly sig- 
nificantly differentiated from each 
other, exceeding the one percent level 
of confidence in both experiments. 
The comparison of the mean re- 
sponses in repeating single words and 
in giving one-word answers to ques- 
tions is interesting. The means were 
approximately 2-3 db lower when the 
Ss invented the answers than when 
they said back the words. In order
to test the relationships of the data of 
Experiments 1-2 the results were con- 
sidered together and a single analysis 
of variance was applied to the arrays 
of data. The ratio of the interaction 
variances for intensity X conditions 
and for intensity X conditions X sub- 
jects was not significant (F). This 


justifies the assumption that the two 
arrays of data are the ‘same’ in that 
they represent a single trend (not 


necessarily zero difference). There 
being no over-all interaction for the 
two components under test in the 
analysis, intensity and conditions 
(words and sentences), each was com- 
pared with the appropriate first-order 
interaction variance, respectively in- 
tensity X subjects and conditions X 
subjects. Significance of the main 
effects, established earlier, was thus 
corroborated: F (intensity), 140.25 
(1% = 3.51; 4 and 96 d.f.). How- 
ever, the point of the analysis rested 
upon testing the hypothesis that no 
real difference existed between the 
means of the responses to words and 
questions. This was tested by the 
ratio: variance (conditions)/variance 
(conditions X subjects). F was highly 
significant, 17.74 (1% = 7.82). The 
hypothesis was rejected and the 
probable independence of the two 
conditions established. This analysis
did not show that each pair of means
for the two conditions was dissimilar.
The t’s were computed (based upon
distributions of differences) between 
corresponding pairs of word and 
sentence means at each level of 
intensity. 


Condition 1 
Condition 2 
Condition 3 
Condition 4 
Condition 5 


Since linearity of the means in 
Experiment 1 was not established
and the means of the two experiments 
were found to be the same in trend, a 
test for linearity of the five conditions 
in Experiment 2 was not indicated. 
A test for linearity of regression was 
made, however, for Conditions 2-5 as 
in Experiment 1. This resulted in an 
F of 3.45 (5% = 3.13), exceeding
significance and making linearity
among these conditions improbable. 

Clearly, (1) whether repeating 
words or answering questions, Ss 
responded with increased intensity to 
more intense stimuli, (2) mean speak- 
ing performances represented by the 
upper two lines of the figure were 
different from each other in intensity 
at each of the five comparison levels, 
although (3) the trends of the suc- 
cessive mean responses to the two 
conditions, words and sentences, were 
the same. 
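
The arithmetic behind these statements can be checked directly from the means of Table I. The sketch below transcribes columns A and B; the Condition 3 mean of column A is not legible in the table as reproduced here and is therefore entered as a missing value rather than guessed. The function names are illustrative only.

```python
# Means in db transcribed from Table I; None marks the one illegible entry.
words = [74.02, 73.14, None, 78.38, 83.98]       # column A: repeated words
questions = [71.26, 70.94, 73.26, 75.78, 81.14]  # column B: answers to questions


def successive_differences(means):
    """Change from each condition to the next (None where a mean is missing)."""
    return [
        None if a is None or b is None else round(b - a, 2)
        for a, b in zip(means, means[1:])
    ]


def paired_differences(first, second):
    """Difference between two columns at each condition (None where missing)."""
    return [
        None if a is None or b is None else round(a - b, 2)
        for a, b in zip(first, second)
    ]


print("Successive steps, words:    ", successive_differences(words))
print("Successive steps, questions:", successive_differences(questions))
print("Words minus questions:      ", paired_differences(words, questions))
```

The paired differences that can be computed fall between 2 and 3 db at every level, while the successive steps within each column grow larger toward the most intense stimuli.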

In subsequent interviews, Ss ex- 
pressed confidence that they had not 
knowingly imitated the loudness of 
the stimuli when they made their 
responses. To test whether the fac- 
tors that led to an increase in intensity 
of response in keeping with greater 
intensity of the stimulus were beyond 
the control of the Ss, a third study, 
similar in plan to the earlier ones,
was conducted. The stimulus ma-
terials of Experiment 1 were used.
Ss were told: 


Earlier in taking this test you or your 
fellow students spoke softly when the volume 
from the record was low, and loudly when it 
was turned up. We want to know whether it 
is possible for you to speak with the same loud- 
ness irrespective of what you hear. Whatever 
you hear from the record do not change your 
loudness.* 


Table I (column C) summarizes 
the results, and the bottom line of 
Fig. 1 shows them in relation to the
results of Experiments 1-2. A tend-
ency to talk with greater intensity in
keeping with increased intensity of 
stimuli persisted in spite of the Ss’ 
efforts to talk with a single constant 
level. Ss were successful in main- 
taining constant intensity only when 
responding to weak or medium 
strength signals. Differences be- 


tween successive means for the first 
four conditions were not significant. 


However, differences between Condi- 
tions 2-4 were significant and be- 
tween 4-5, highly significant. 

Also, when the results from the 16 
Ss of this experiment are compared 
with previous group responses to the
same stimuli (words), it is noted that 
the listener-speakers responded with 
less intensity when they attempted 
to talk with a constant loudness. In 
response to the soft stimuli the ‘con- 

3 Of the 16 Ss, six had taken part in Experi-
ments 1-2. An analysis of variance was made to
find whether the experienced and naive Ss consti-
tuted a single population. The analysis indi-
cated a single population, F = 2.57 (5% = 7.71).

trolled’ intensities approximated the
‘natural’ responses to questions— 
significantly lower than ‘natural’ re- 
sponses to words. As the intensity 
levels of the stimuli were increased, 
the disparity between the intensities 
of ‘controlled’ and ‘natural’ responses 
increased. 

An earlier analysis established that 
the means of Experiments 1-2 were 
the same in trend although different in 
value. A similar comparison was 
made between Experiments 2-3. The 
ratio between the interaction variance
and the error variance [total sums
of squares of the two experiments/
$(N_{\text{exp.1}} + N_{\text{exp.2}} - 2)(k - 1)$] exceeded
the level necessary for significance,
indicating that the two lower curves 
in the figure represent different popu- 
lations of means. 


SUMMARY 


The findings of these three studies 
emphasize the tendency of Ss to talk 
with different intensities in keeping 
with the level of intensity of heard 
stimulus materials. The trends in 
this regard were the same whether 
the stimuli were words that were to 
be repeated or questions to be an- 
swered. Repeated words were spoken 
more intensely than were answers to 
questions that were heard under the 
same conditions. Finally, it was not 
possible for the Ss to ‘say back’ words 
at a single level of intensity when 
they were heard at different levels. 


(Manuscript received July 19, 1948) 





GENERALIZATION OF A REFERENCE SCALE 
FOR JUDGING PITCH 


BY DONALD M. JOHNSON 


University of Minnesota, Duluth Branch 


This paper reports the third in a 
series of experiments testing a theory 
of the formation of a reference scale. 
The theory is briefly recapitulated, 
then an experiment is described, the 
data from which agree closely with 
the values predicted by the theory. 

It is generally agreed that to predict 
a person’s judgment, e.g., of the pitch 
of a sound, we must know not only 
the effect of the sound on the receptor 
apparatus but also the scale to which 
the sound is referred. Since the 
importance of the reference scale is 
recognized both in psychophysics and 
social psychology (where the term 
‘frame of reference’ is more often 
heard), a theoretical treatment of its 
formation seems desirable. The ex- 


tension of mathematical theory, at 
present applied chiefly to simple or 
low-level behavior, to the field of 
judgment, even if only in a modest 
way, should be rewarding. 

The scale can be described with as 


much precision as we may wish. We 
can present our S with a series of 
objects, in this case sounds, one at a 
time, by the method known as 
‘absolute judgment’ or ‘judgment of 
single stimuli,’ asking him to judge 
each ‘high’ or ‘low.’ From a tabula- 
tion of his judgments we can calculate 
a limen or boundary between his 
categories ‘high’ and ‘low.’ This 
statistically determined category li- 
men, describing the subjective mid- 
point of the scale, is the fact which our 
theory must account for. 

The scale our S uses in judging pitch 
is certainly not innate; he must acquire 
it. And it is reasonable to assume 
that he learns it from his experience 


with the series of stimuli that we ask 
him to judge. The scale can be con- 
sidered a concept, a generalization 
from many particular experiences with 
the sounds. Knowing these particu- 
lar sounds that he has heard during 
the experiment we should be able to 
predict where he will establish the 
midpoint or category limen of his 
scale. 

The effect of any given stimulus, 
x, on the perceptual apparatus must 
be functionally related to the physical 
characteristics of the stimulus. In 
the present case, the perception of 
pitch, the function is usually assumed 
to be a logarithmic one. Hence, 
calling the central effect of stimula- 
tion y, we can write an equation for the 
receptor function, 


$$y = f(\log x). \qquad (1)$$


To illustrate how these central effects may be combined or
generalized to form a scale the simplest case is probably the one in
which two stimuli, $x_1$ and $x_2$, are presented to S, yielding the
central effects, $y_1$ and $y_2$. Then another stimulus, $x_c$, chosen
so that $y_1 < y_c < y_2$, is presented to S with instructions to judge
it ‘low’ or ‘high.’ The question is: will $x_c$ evoke the same response
as $x_1$ or $x_2$? Presumably the effects of $x_1$ and $x_2$ spread up
and down the stimulus continuum so that $x_c$ is more likely to be
judged ‘high’ the closer $y_c$ is to $y_2$ and the farther it is from
$y_1$.

In the general case, when S has been exposed to many stimuli, $x_i$,
yielding the central effects, $y_i$, if the generalization gradient is
symmetrical, the tendency to call a specific stimulus, $x_c$, ‘high’
rather than ‘low’ can be written thus:


$$T_{h-l} = F(y_c - y_i). \qquad (2)$$


If we are willing to assume that the effect of stimulation by any
stimulus spreads up and down the stimulus continuum in a linear way,
F becomes a simple constant of proportionality and we can sum these
tendencies for N stimuli:


$$\sum T_{h-l} = k\left(N y_c - \sum y_i\right). \qquad (3)$$


To locate the point on the continuum where the resultant tendency to
respond ‘low’ is equal to the tendency to respond ‘high,’ we set this
quantity at zero and solve for $y_c$.


$$y_c = \frac{\sum y_i}{N}. \qquad (4)$$


Equation 4 states that the mid- 
point of a subjective scale, the point 
at which a stimulus will be judged 
‘low’ and ‘high’ equally often, is the 
simple arithmetic mean of the central 
effects of the stimuli upon which the 
scale was constructed. 
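
The step from the summed tendency to the arithmetic mean can be written out explicitly. The lines below are only a restatement of the argument, using $y_c$ for the midpoint, $y_i$ for the central effects of the N stimuli, and $k$ for the constant of proportionality, which is assumed to be nonzero.

```latex
\begin{align*}
k\Bigl(N y_c - \sum_{i=1}^{N} y_i\Bigr) &= 0 \\
N y_c &= \sum_{i=1}^{N} y_i \qquad (k \neq 0) \\
y_c &= \frac{1}{N}\sum_{i=1}^{N} y_i
\end{align*}
```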

But these central effects or y-values, 
were assumed by Equation 1 to be 
logarithmically related to the fre- 
quency of the physical stimulus. 
Hence it follows, in the case of judg- 
ments of pitch, that the midpoint of 
the judge’s subjective scale is the 
logarithmic or geometric mean of the 
frequencies of the stimuli to which 
he has been exposed. 


$$\log x_c = \frac{\sum \log x_i}{N}. \qquad (5)$$

Thus, we return to the physical units of the stimulus objects, and
Equation 5 is in a form that is immediately verifiable by experiment.

A very important assumption of this argument is that each stimulus
being judged has unit weight in the
calculations. The necessary condi- 
tions for this assumption are obtained 
if we have the S pay the same atten- 
tion to each stimulus, that is, have 
him judge each stimulus as in the 
method of single stimuli. In other 
words we are taking each judgment 
as one unit of practice, and it is these 
practice effects that are being general- 
ized into a scale of judgment. 

When we are all through we have 
a simple and altogether general theory 
which states that, when a person is 
given a series of sounds with instruc- 
tions to judge each ‘high’ or ‘low,’ 
the boundary between the two cate- 
gories will be at the geometric mean 
of the frequencies which he is judging. 
Scales formed for judgment in other 
modalities will be generalized accord- 
ing to the same principle, that is, 
according to Equation 4, but the 
shape of the receptor function, Equa- 
tion 1, may be different. 
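
The prediction for pitch can be sketched in a few lines of code. This is a minimal illustration, not part of the original study: the function name `category_limen`, the `counts` argument (number of presentations of each stimulus, so that each presentation carries unit weight), and the three-tone series are all hypothetical.

```python
import math


def category_limen(frequencies, counts=None):
    """Predicted boundary between 'low' and 'high' (Equation 5).

    Returns the geometric mean of the presented frequencies. `counts`
    gives how many times each frequency is presented; by default each
    frequency is presented once.
    """
    if counts is None:
        counts = [1] * len(frequencies)
    total = sum(counts)
    # Arithmetic mean of the log-frequencies, one unit per presentation.
    mean_log = sum(n * math.log(f) for f, n in zip(frequencies, counts)) / total
    return math.exp(mean_log)


# Hypothetical series: three tones an octave apart, presented equally often.
equal = category_limen([128, 256, 512])
# Same tones, with the lowest tone presented most often.
skewed = category_limen([128, 256, 512], counts=[3, 2, 1])
print(round(equal, 1), round(skewed, 1))
```

Weighting by presentation count is the same device the skewed series rely on: a stimulus presented 40 times enters the mean 40 times.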


AN EXPERIMENT IN PREDICTION
Procedure 


The experimental test of this theory was 
carried out with the Western Electric 6B audi- 
ometer. The accuracy of this clinical instru- 
ment is good enough for present purposes, be- 
cause each series covers a wide range of frequen- 
cies, from 128 to 1024 cycles per sec., for example, 
so that an error of 5 or even 10 cycles per sec. in 
any one setting would not be crucial, especially 
since we run through the series 10 times and the 
errors are probably random. 

The sounds were roughly equated for loudness 
according to the equal-loudness contours pub- 
lished by Stevens and Davis (9). A preliminary 
check with our S’s and our apparatus agreed 
approximately with the published contours. 
The loudness level was 60 decibels. 

Twenty-five stimuli were used, covering the 
six octaves from 128 to 8192 cycles per sec. 
Since the audiometer dial is graduated in quarter-
octaves and since the quarter-octave is a log-
arithmic unit convenient for calculation, the
quarter-octaves were assigned numbers, 1, 2, 
3... 25, and it is these numbers that were 
used in the calculations. 
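
The quarter-octave numbering can be converted to and from cycles per second as sketched below. The sketch assumes that number 1 corresponds to 128 cycles per sec. and number 25 to 8192, which is consistent with the ranges stated in the text; the function names are hypothetical.

```python
import math

BASE_HZ = 128.0  # assumed frequency of quarter-octave number 1


def qo_to_hz(number):
    """Frequency in cycles per sec. for a quarter-octave number."""
    return BASE_HZ * 2 ** ((number - 1) / 4)


def hz_to_qo(frequency):
    """Quarter-octave number (possibly fractional) for a frequency."""
    return 1 + 4 * math.log2(frequency / BASE_HZ)


print(qo_to_hz(1), qo_to_hz(13), qo_to_hz(25))
```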

TABLE I

COMPARISON OF PREDICTED AND OBTAINED LIMENS
The units are quarter-octaves (q.o.) except in the second column.

Series   Range in c.p.s.   Range in q.o.   Shape    Mean Obtained Limen   Mean Error
A        128-1024          1-13            +skew     2.55                 -1.45
B        128-1024          1-13            rect.     8.45                 +1.45
C        512-4096          9-21            +skew    11.49                 -0.51
D        512-4096          9-21            rect.    14.12                 -0.88
E        512-4096          9-21            -skew    16.86                 -1.14
F        2048-8192         17-25           rect.    19.83                 -1.17
G        2048-8192         17-25           -skew    21.90                 -0.90

Seven series were used, each consisting of 9 
to 14 stimuli in steps of one or more quarter- 
octaves. Some of the series were rectangular, 
that is, the stimuli were equally spaced and each 
stimulus was presented ten times. Some of the 
series were skewed, that is, the highest stimulus 
was presented 40 times, the next highest 30 
times, the next 20 times, and so on. Some series
were similarly skewed in the other direction. 
The skewed series were used as a severe test of 
the generalization theory. Many equations 
might predict the limens of the rectangular series, 
but if our assumption of linear generalization is 
erroneous, our predictions for the highly skewed
series will be badly off. 

The sounds were presented in random order 
for 2.5 sec. at intervals of 5 sec. The Ss were 29 
college students, four or five of whom judged 
each series. They were instructed to call each 
sound, including the first, ‘high’ or ‘low.’ They 
were reassured that, although they might be 
uncertain of the sounds at first, they would soon 
catch on, and they should “just call them the 
way they sound.” Category limens were com- 
puted by an adaptation of the method of con- 
stant stimuli. 


RESULTS 


The results are shown in Table I. 
Series A covered the range from 128 to 
1024 cycles per sec. The stimuli 
were distributed in a series with 
positive skew, that is, the lowest fre- 
quency was presented forty times, the 
next thirty times, and so on. The 
theory predicted a category limen in 
terms of quarter-octaves of 4.00. 
The mean of the limens obtained from 
the four Ss was 2.55. Hence the 
discrepancy between theory and fact 
was 1.45 on the average.


Fig. 1 shows the results graphically. 
The coordinates are on a logarithmic 
scale, the vertical being the obtained 
values and the baseline the theoretical 
values. The vertical lines represent 
the range of frequencies used in a 
series and each point represents the 
category limen for one S. Frequencies above the point were judged
‘high’ and those below the point were judged ‘low.’ The obtained data
were plotted above the theoretical values of the baseline so that, if
theory and fact agreed perfectly and there were no individual
variations, all points would lie on the diagonal.

Fig. 1. Comparison of predicted and obtained category limens for
two-category judgments of pitch by the method of single stimuli.
The vertical lines represent the range of frequencies used in a
series. The 29 points represent the obtained category limens or
boundaries, stimuli above which were judged ‘high’ and below which
were judged ‘low.’ The obtained values for each series are plotted
above the predicted values for that series so that, if theory and fact
agreed perfectly, all points would lie on the diagonal. The
correlation between the two is .97.

Let us look first at Series D. This 
series of stimuli was a rectangular 
one, covering three octaves from 512 
to 4096 by equal steps of quarter- 
octaves, 13 stimuli in all. Calcula- 
tions from the theory put the cate- 
gory limen between ‘high’ and ‘low’ 
at the geometric mean of the series, 
which is about 1450, so the data are 
plotted above 1450. The five data 
points for this series lie above and 
below the diagonal, with the mean 
(see Table I) slightly below the pre- 
dicted value. 
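
The Series D prediction can be checked with a short calculation. The sketch assumes the quarter-octave numbering used in the calculations, with number 1 at 128 cycles per sec., so that Series D runs from number 9 to number 21; the variable names are illustrative only.

```python
# Series D: 13 equally spaced stimuli, each presented equally often,
# expressed as quarter-octave numbers 9 through 21.
series_d = list(range(9, 22))

# Equation 4 applied to the log units: the arithmetic mean of the
# quarter-octave numbers is the predicted category limen.
limen_qo = sum(series_d) / len(series_d)

# Convert back to cycles per sec. (number 1 is assumed to be 128 c.p.s.).
limen_hz = 128 * 2 ** ((limen_qo - 1) / 4)

print(limen_qo, round(limen_hz))
```

The result falls at quarter-octave 15, a little under 1450 cycles per sec., which agrees with the value quoted for the geometric mean of the series.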

In Series C the range was the same 
as in Series D, but the sounds were 
concentrated at the low end and the 
predicted limen is therefore lower. 
Two obtained limens are practically 
at the predicted value, and the other 
two are low, so the mean discrepancy 
in terms of quarter-octaves, is 0.51. 
The same range is covered in Series 
E but the stimuli are concentrated 
at the high end and the predicted 
category limen is therefore higher. 
The range, the theoretical values, and 
the obtained values are likewise shown 
for the other series. 


DISCUSSION


It is obvious from the table and the 
graph that the obtained category 
limens agree very well with the theory. 
Average error in predicting the 29 obtained values is slightly more
than one quarter-octave. In comparison with the whole range of six
octaves (24 quarter-octaves) this represents an error of about five
percent. The standard error of estimate is about one and one-half
quarter-octaves, and the correlation between theoretical and
obtained limens is .97.
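
A rough check of these figures can be made from the mean errors listed in Table I. This is only an approximation under stated assumptions: it uses the seven series means rather than the 29 individual limens, weights each series equally, and takes the whole range as 24 quarter-octaves.

```python
# Mean error per series, in quarter-octaves, as listed in Table I.
mean_errors_qo = [-1.45, +1.45, -0.51, -0.88, -1.14, -1.17, -0.90]

# Average size of the error, ignoring sign.
mean_abs_error = sum(abs(e) for e in mean_errors_qo) / len(mean_errors_qo)

# Six octaves from 128 to 8192 c.p.s. equal 24 quarter-octave steps.
WHOLE_RANGE_QO = 24
fraction_of_range = mean_abs_error / WHOLE_RANGE_QO

print(round(mean_abs_error, 2), round(100 * fraction_of_range, 1))
```

On these assumptions the average error comes to a little over one quarter-octave, or between four and five percent of the range.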

It is worthy of note that these the- 
oretical values permit genuine pre- 
diction and a rigid test of the theory. 
The predicted values can be computed 
before a single subject enters the 
laboratory. We are not fitting a 
curve but computing a single value 
for each series, to which a single 
empirical value can be compared. 

The generality of the theory also 
should be emphasized. The first 
experiment in this series (2) dealt 
with judgments of lifted weights, the 
receptor function for which is un- 
known, and a questionable receptor 
function had to be assumed. But the 
generalization function which pre- 
dicted the category limens for judg- 
ments of lifted weights with fair ac- 
curacy is identical with the present 
Equation 4. The same equation has 
worked quite well in predicting judg- 
ments of success or failure in a penny- 
pitching experiment, and was ex- 
tended to include judgments in four 
categories (4). The present experi- 
ment is perhaps the most convincing, 
because no new assumptions were re- 
quired and it was therefore possible 
to perform all the calculations before 
experimentation was begun. 

By means of Equation 5 one can
predict the category boundaries of 
scales based on any kind of series of 
frequencies, not merely those used in 
the present experiment. For ex- 
ample, the effect of a remote anchor- 
ing stimulus, which has come in for 
some investigation recently (6, 8), 
could be easily predicted under some 
conditions. For the series of nine 
sounds from 128 to 512 by quarter- 
octaves the theory would predict a 
midpoint of 256. If an anchoring 
stimulus of 1028 is added to these 
nine and the Ss are required to 
judge all 10 stimuli ‘high’ or ‘low,’
the theory predicts a midpoint of
about 294, a shift upward of almost a 
quarter-octave. 
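
The anchoring example can be worked through numerically. The sketch below assumes the nine sounds are spaced exactly a quarter-octave apart from 128 to 512 cycles per sec. and that the anchoring stimulus is judged once, like every other stimulus; the helper name `geometric_mean` is illustrative.

```python
import math


def geometric_mean(values):
    """Geometric mean, computed as the mean of base-2 logarithms."""
    return 2 ** (sum(math.log2(v) for v in values) / len(values))


# Nine sounds from 128 to 512 c.p.s. in quarter-octave steps.
series = [128 * 2 ** (k / 4) for k in range(9)]

midpoint = geometric_mean(series)            # midpoint without the anchor
anchored = geometric_mean(series + [1028])   # anchoring stimulus added once

# Size of the upward shift, expressed in quarter-octaves.
shift_qo = 4 * math.log2(anchored / midpoint)

print(round(midpoint), round(anchored), round(shift_qo, 2))
```

Because every stimulus has unit weight, one remote stimulus among ten moves the midpoint by one tenth of its distance, in log units, from the old midpoint.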

Two limitations must be pointed 
out. The theory states how the 
effects of stimulation are combined or 
integrated. But, if one stimulus is 
far removed from the others, it may 
simply not be integrated. It may 
remain perceptually outside the con- 
figuration formed by the other stimuli. 
The pitch continuum is a well-con- 
ceptualized continuum, however, 
hence such discontinuity is less likely 
than in other modalities. 

The second limitation is the ne- 
cessity that all stimuli have equal 
weight in the calculations. If the in- 
structions direct that some stimuli, 
e.g., an anchoring stimulus or a stand- 
ard, be specially emphasized, or over- 
looked, for any reason, a weighted 
mean would be required in place of 
Equation 4. The comparative judgment as it is usually carried out may
be interpreted as judgment of a series in which the standard has a
special weight. In an earlier discussion (3)
it was assumed that the standard 
stimulus would have an attention 
value or weight of half that of the 
comparison stimulus, and calculations 
on this basis agreed quite well with 
published results for weight-lifting 
experiments. 

Professor Harry Helson’s theory of 
adaptation level as a frame of refer- 
ence (1) is similar in most respects to 
the theory discussed above. It “orig- 
inated as a short-hand description and 
explanation of certain fundamental 
phenomena in vision,” and includes 
brightnesses of both sample stimuli 
and background in a weighted geo- 
metric mean. Predicted values from 
his theory agree very well with data 
from a wide variety of experiments on 
loudness of sounds and heaviness of 
weights after suitable constants have been chosen to account for the
time-order effects and the varying importance of standard and
comparison stimuli. For judgments of single stimuli as in the present
experiment Helson’s Equation 3 is applicable:
$\log(AL - 0.75d) = \sum \log x_i / n$. AL refers to the adaptation
level, which may be taken as equivalent to our $x_c$ or category
limen. The constant 0.75d is introduced empirically to
account for time-order effects and the 
size of the step-interval. The ra- 
tionale for this constant is not quite 
clear to the present writer, but in any 
event there seems to be no significant 
time error in judgments of pitch (7). 
Eliminating this constant, therefore, 
we find that Helson’s Equation 3 is 
identical with Equation 5 of the 
present discussion. It is encouraging 
to note that two theories of the refer- 
ence scale, one beginning with visual 
phenomena, the other with lifted 
weights, converge to almost complete 
agreement under some conditions. 
Helson’s equation, in fact, is able to 
fit the data of the first experiment of 
this series (2) almost as well as the 
present formulation. (See 1, Table 
VI.) This convergence, and particu- 
larly Helson’s discussion of the breadth 
of application of his theory, illustrates 
once again the power of mathematical 
formulation even when applied to a 
field as old as psychophysics. 

One minor divergence in the two 
statements of the theory should be 
noted. Helson has been interested 
thus far in psychophysical judgments, 
for most of which some sort of logarithmic relationship, as in
Equation 1, is well supported. But if the theory is applied to
judgments of other material, such as the prestige of occupations
(5, 6), the desirability of various forms of social conduct (6),
or personal success and failure (4), the logarithmic relationship
will have to be modified. Such an outcome can
be expected even if psychophysical 
judgments of lengths of lines are con- 
sidered. For this reason the present 
statement of the theory has treated 
Equation 1 as specific to the material 
of judgment and Equation 4 as a gen- 
eral principle of judgment (or adapta- 
tion). 


CONCLUSIONS 


This experiment verifies the im- 
portance of the reference scale. We 
can make one S call almost any sound 
‘high’ and another S call the same 
sound ‘low’ by presenting each with 
an appropriate series of sounds in 
reference to which their judgments 
are made. 

The obtained data strongly support 
the generalization theory on which 
the predictions are based. The aver- 
age error in predicting the category 
limens is about five percent. The 
correlation between theoretical and 
obtained values is .97. The effects of practice in judging these
sounds must be combined, or integrated, or generalized in something
like the manner described by Equation 5 or the obtained values,
especially for the highly skewed series, would deviate much
more from the predictions.

The success of this experiment and 
the two previous experiments, to- 
gether with Helson’s work, indicates 
that mathematical theory can be
profitably employed in the investiga-
tion of the higher mental processes. 


(Manuscript received June 25, 1948)


A NOTE ON THE ALTERNATION OF GUESSES


BY RICHARD L. SOLOMON ¹
Harvard University 


It has often been shown that or- 
ganisms tend to avoid the repetition 
of responses in two-choice situations. 
This generalization applies especially 
to those situations where the response 
tendencies toward each choice are 
equal, due either to previous condi- 
tioning or to limitations in powers of 
discrimination. Experiments demon- 
strating this generalization have been 
discussed at length in a recent review 
(5) and so need not be described here. 
Suffice it to say, whether we observe 
rats in a T-maze, or human subjects in 
psychophysical judgments, number- 
word association series, or E.S.P. ex- 
periments, the avoidance of repetition 
of responses has been observed to 
occur. The theoretical formulation which currently appears to explain
such data is that of ‘reactive inhibition’ (see Hull, 3) or negative
response-produced drive stimulation
(5). Such theory places great em-
phasis upon the time interval between 
responses and the amount of effort 
involved in making each response. 
Briefly, the deductions from Hull’s 
theory would predict that the occur- 
rence of alternation of choices should 
increase as the time interval decreases 
and as the effort increases. Heathers 
(2), using rats in a single unit T-maze, 
has found that the alternation of left 
and right choices decreases in fre- 
quency of occurrence as the time 
interval between runs is increased. 
In an unpublished experiment the author has found that increasing the
effort requirement will raise the alternation frequency of rats in a
T-maze. It remains to be seen whether the deductions from Hull’s
theory apply to human two-choice behavior. The experiments below are
preliminary attempts to approach this problem.

¹ The author wishes to express appreciation to Professor Harold
Schlosberg, Brown University, for his helpful cooperation in the
design and administration of these experiments. This work was
performed while the author was an N.R.C. Predoctoral Fellow.

Goodfellow’s (1) account of the re- 
sults of the Zenith Radio E.S.P. ex- 
periment contains adequate evidence 
that human subjects tend to alternate 
guesses where there are two alterna- 
tives. Skinner (4) has recognized 
this tendency in analyzing Good- 
fellow’s data. If this human ‘guess- 
ing’ behavior follows the same laws as 
revealed by Heathers (2) with rats in 
a T-maze, then we might expect that 
the avoidance of repetition of guesses 
in two-choice E.S.P. situations should 
increase in frequency as the time 
interval between guesses is decreased. 
In addition, we might expect that if 
the level of effort involved in making 
a choice-response is increased the 
level of alternation should increase.


FIRST EXPERIMENT


Subjects.—The subjects in these experiments
were 192 students, divided randomly into two 
parallel lecture sections of the course in elemen- 
tary psychology at Brown University. There 
were both men and women in both groups, but 
no attempt was made to analyze the perform- 
ances on the basis of sex. The subjects were 
naive with respect to the purposes of the experiment. Since their
usual lecturer ² conducted the experiment as a regular demonstration,
the class cooperated very well.

Apparatus.—No apparatus was required for this experiment. The items
needed were coins for the professor to hold, and slips of paper, with
numbered lines, one through five, stamped on the slips. The numbers
were spaced, one under another, in such a way that the slip could
be folded to cover up each written response after
the response was made.

² Professor Schlosberg

Procedure and Results.—The experiments were
conducted ostensibly as a part of a lecture. The
professor explained to the class the idea of 
‘extrasensory-perception’ and then stated that it 
would be possible for the class to test the hy- 
pothesis. Response-slips were distributed to all 
members of a class, and directions for making 
guesses were given. The subjects were told that 
the lecturer had a stack of five pennies in his 
hand, and would look at them, one at a time, 
while trying to ‘send’ the information to the 
students by means of ‘E.S.P.’ The lecturer 
warned the class when he was looking at a 
penny, and told them to write on the slips of 
paper the word ‘heads’ or ‘tails,’ depending on 
which side of the coin the student thought the 
lecturer was staring at. Thus the students’ 
problem was, on the surface, trying to guess 
whether a coin in the professor’s hand showed 
heads or tails. (The sequence of coin faces was
always T, H, H, T, T; but we were not actually
interested in how well the class could guess, and
so the success in terms of ‘E.S.P.’ will not be
reported here.) ³ The first class of students, num-
bering 83, made guesses at intervals of 15 sec. 
There was an error of about five sec. in this time 
interval, due to the limitations of the classroom 
situation; someone was always tardy in folding 
the paper, writing the response, etc. After each 
response, the subjects were warned to fold over 
the slip of paper immediately in such a way that 
the previous responses were all invisible. Thus, 
each guess was made without the written word, 
representing the previous guess, being in sight. 
Five guesses were made. The same procedure 
was followed in the second lecture section, which 
numbered 99 students, except that guesses were 
made at eight-min. intervals. On a signal from 
the writer, posted in the rear of the room, the 
lecturer interrupted his lecture to say ‘ready,’ 
and looked at a coin. The subjects wrote down 
their guesses of heads or tails, folded their papers, 
and returned to the note-taking activities at- 
tendant to standard lectures in Psychology. 
Seventeen days after these experiments, the 
same classes were presented with very much the 
same type of situation, with the following modi- 
fications. The first class (this time numbering 
89) was given the same directions used in the 
first experimental session, but they were instructed to write their
responses or guesses with their non-preferred hand. Their guesses were
to be written out as: “I think it was heads” or “I think it was
tails.” This was done to increase the ‘effort’ involved in making the
guess-responses. The time interval between responses varied between
45 and 60 sec. The 15-sec. interval was impossible to duplicate
because the guess-responses required more time than in the first
classroom session. The second class (this time numbering 103 students)
also went through the same type of situation 17 days after the first
session, and was a control group. The procedure was identical from the
first to second experiments for this group, with one exception: the
time interval between responses was, because of a late start, seven
min. instead of the original eight-min. period.

³ The data were tabulated for report back to the class. There was no
indication of the presence of ‘E.S.P.’


The responses of all the subjects 
were tabulated, and the number of 
avoidances of repetition of guesses was 
counted. The results for the four 
experiments are tabulated in Table I, 
in terms of the number of subjects 
showing specified frequencies of non- 
repetitive responses in the series of 
five responses. The maximum possi- 
ble frequency was four alternations in 
five responses. Data are also pre- 
sented on the percent above chance 
alternation, represented by each per- 
centage of alternation, together with 
the probability that such a level of 
alternation could have occurred on 
the basis of chance alone. We note 
that all classes, under all conditions, 
avoided repeating responses to an 
extent which exceeded chance ex- 
pectations by a significant amount. 
The differences between the groups 
were not significant, however, as 
determined by the χ² test. Thus
there was a marked tendency to 
avoid repeating ‘just-made’ responses 
in this experimental situation. But 
the variations in time interval and the 
introduction of an ‘effortful’ response 
were not significant variables in the
present experiment. 
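
The chance level against which alternation is compared is given in the footnote to Table I as 8HT for five responses. One reading of that expression, offered here as an assumption rather than as the author's stated derivation, is that each of the four successive pairs of guesses alternates with probability 2HT when H and T are the proportions of heads and tails guesses. The function name and the 0.6 example are hypothetical.

```python
def chance_alternations(p_heads, n_responses=5):
    """Expected number of alternations if guesses were independent.

    p_heads is the overall proportion of 'heads' guesses; each pair of
    successive guesses alternates with probability 2 * H * T.
    """
    p_tails = 1 - p_heads
    transitions = n_responses - 1
    return transitions * 2 * p_heads * p_tails


# With no preference, half of the four transitions alternate by chance.
print(chance_alternations(0.5))
# A hypothetical 60 percent preference for 'heads' lowers the chance level.
print(round(chance_alternations(0.6), 2))
```

Any preference for one alternative lowers the chance level, so a fixed observed rate of alternation exceeds chance by a larger margin when guesses are lopsided.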

TABLE I

NUMBER OF SUBJECTS MAKING SPECIFIED FREQUENCIES OF NON-REPETITIVE-GUESSES
IN THE FOUR CLASSROOM PSEUDO ‘E.S.P.’ EXPERIMENTS

                          First Sessions             Later Sessions
Number of Alternations    15-Sec.      8-Min.        ‘Effort’ with 45-60   7-Min.
                          Intervals    Intervals     Sec. Intervals        Intervals

Totals
Percent Alternation
Percent above Chance *
  Alternation             9.0%
χ²                        24.27
P                         .00008

* Chance level of alternation is computed on the basis of empirical
frequencies of heads and tails responses. This figure, for five
responses, is 8HT, where H is the actual percent of heads guesses and
T the actual percent of tails guesses obtained from all subjects.

The data in Table I do not fit our expectations from theory. On the
basis of theoretical considerations, and the results of Heathers (2),
we would expect that the eight-min. group would have alternated
significantly less than the 15-sec. group. We also might expect that
the ‘effort’ group should have avoided repetitive responses more than
the seven-min. control group. Such was
not the case. The most probable 
explanation for these results lies in the 
fact that the responses were verbal- 
ized, and certain private conceptions 
about the laws of chance probably 
determined a large body of the stu- 
dents’ guesses. There is some evi- 
dence that the idea of a 50-50 chance 
of having heads or tails come up in the
toss of a coin implies to many students 
that the coin surfaces will tend to 
alternate. By this means the stu- 
dents can keep the number of heads 
guesses about equal to the number of 
tails guesses, thus fulfilling the 50-50 
expectations. 


SECOND EXPERIMENT


Subjects.—The subjects in this experiment
were 58 students enrolled in the Elementary
Psychology course at Brown University. There
were three women in this group. The subjects 
were naive with respect to the purpose of the 
experiment. They were cooperative and ap- 
peared to be trying to perform precisely as the 
experimenter directed. 

Fig. 1. Photograph showing the lever-pressing apparatus used in
‘E.S.P.’ experiments. The photograph was taken from the subject’s
side of the apparatus. The pressing-keys and the pushbutton switches
may be seen. The mercury switch and Cenco counter are obscured
since they are mounted on the rear panel.

Apparatus.—The apparatus used in this experiment was a small wooden
box upon which two identical levers were mounted. A photograph of
this apparatus is shown in Fig. 1. The
levers were six in. long, pivoted by means of a
pillow block and steel shaft 44 in. from the 
‘pressing-plates’ which were mounted on the tips 
of the levers. The plates were circular, 2} in. 
in diameter, and were bolted securely in position. 
The levers were of aluminum stock, 6 by ? by 
sin. The other ends of the levers were fastened 
to the bottom of the lever box by means of long 
helical springs set to give a resistance of 7} 
pounds at the limit of the excursion of the levers. 
Directly under each press-plate or key was a 
small closed-circuit push-button switch. The 
two switches were connected in series with one 
another, and operated a Cenco impulse counter. 
Also in series with this circuit was a mercury 
switch mounted on the rear of the box. This 
mercury switch was silent and enabled the ex- 
perimenter to close the circuit and start the im- 
pulse counter. A transformer in the base of the 
apparatus reduced the 110 volt, 60 cycle A.C. 
to the correct voltage for the counter. The buzz 
of the counter served as a stimulus for depressing 
one of the levers. At the end of its stroke the 
lever opened the push switch, and stopped the 
counter. 

Procedure and Results.—The subjects were
told that they were to take part in an ‘E.S.P.’ 
experiment. The theory of extra-sensory-per- 
ception was explained during a regular classroom 
lecture period. When a subject appeared at the 
laboratory for his appointment with the experi- 
menter he was ushered into a small room contain- 
ing two chairs and a small table. The apparatus 
was on the table. The subject sat in the chair 
facing the front of the apparatus, opposite the 
lever plates. The experimenter sat at the other 
side of the table, facing the back of the apparatus 
on which the mercury switch and counter were 
mounted (see Fig. 1). So that the subject could
not see the face of the experimenter, a piece of 
cardboard was tacked to the back of the ap- 
paratus, separating the subject and the experi- 
menter. The experimenter explained to the 
subject that he was going to say ‘ready’ and then 
think about which lever he wanted the subject 
to press. The subjects were instructed to use 
the index and third finger for pressing the lever, 
and they were told to keep these fingers resting 
on the levers at all times. The experimenter 
further explained that the subject was to react 
as quickly as possible with a pressing of the right 
or left lever as soon as he heard a buzzer sound. 
He was told that pressing the lever, no matter 
whether it was the left hand or right hand one, 
the correct or incorrect one, would terminate the 
buzzer. This procedure enabled the experi- 
menter to control the time between responses 
with an error equivalent to the variability of 
auditory reaction times. The subjects were 
given six trials; 30 subjects were given 25 sec.
between trials, and the remaining 28 subjects
were given only five sec. between trials. Reac- 
tion times were recorded to check on the vari- 
ability of time intervals, but were not used in the 
analysis of results. In general, the error in time 
between responses imposed by auditory reaction 
times was negligible. 


The responses were recorded in 
order, either as right hand or left 
hand. The maximum possible num- 
ber of non-repetitive-responses in this 
experiment was five. Table II shows 
the frequencies of occurrence of differ- 
ent numbers of non-repetitive-re- 
sponses for all the subjects. The 
table reveals that the group having 
25 sec. between responses showed 
more alternations than did the five- 
sec. group. The difference between 
these groups was not significant, so 
the two groups can be considered as a 
single sample. The percent non- 
repetitive-responses for all subjects 
was 59.0 percent. The number of 
right hand responses on the first trial 
was 35 out of 58 total responses on 
the first trial, or a preference for right- 
pressing of 61 percent. For all six 
trials, this preference was 56 percent. 
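
The pooled figure can be checked against the two group percentages reported in Table II. The sketch assumes that every subject contributed the same number of response pairs, so that the groups are weighted simply by their sizes (30 and 28 subjects); the dictionary layout is illustrative.

```python
# Group label -> (number of subjects, percent non-repetitive-responses),
# using the group sizes from the procedure and the percentages in Table II.
groups = {
    "25-sec.": (30, 61.3),
    "5-sec.": (28, 56.5),
}

total_subjects = sum(n for n, _ in groups.values())
# Weight each group's percentage by its number of subjects.
pooled_percent = sum(n * pct for n, pct in groups.values()) / total_subjects

print(total_subjects, round(pooled_percent, 1))
```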


TABLE II

FREQUENCY OF OCCURRENCE OF NON-REPETITIVE-RESPONSES FOR SUBJECTS
IN LEVER PRESSING EXPERIMENT

Number of      Frequency for    Frequency for    Total
N-R-R’s        25-Sec. Group    5-Sec. Group     Frequency

Totals
Percent
Alternation    61.3%            56.5%            59.0%


DISCUSSION OF RESULTS


The results of the second experiment were strikingly similar to those
of the previous experiment on coin-
guessing. The absolute level of the 
percent non-repetitive-responses in 
both experiments was approximately 
60 percent. There was a marked 
preference for one of the two alterna- 
tives in each experiment; responses of 
‘heads’ on the one hand, and ‘right’ 
on the other, were predominant. 
And, in general, the results of both 
experiments probably indicated that 
the responses were reflecting the sub- 
jects’ conception of ‘chance’ rather 
than basic response mechanisms. 

The predictions made on the basis 
of reactive inhibition postulates or 
response-produced negative drive 
stimuli were not substantiated by our 
experimental evidence. On the basis 
of such theories we would expect that 
there would have been a higher level 
of alternation for the five-sec. group 
than for the 25-sec. group. The level 
of effort imposed by the springing of 
the two levers would be expected to 
raise the general level of non-repetitive-responses 
above that of the 
coin-guessing experiments. Neither 
of these expectations was borne out. 
One conclusion would appear to be 
that the subjects were responding on 
the basis of higher verbal processes, 
involving personal conceptions of the 
nature of ‘chance.’ Due to this consideration, 
the experiment may not 
be a legitimate test of Hull’s hypotheses. 
We should add, however, that 
the introduction of effort into the 
problem of guessing one of two alter- 
natives might still affect the level of 
alternation if the task involved large 
portions of the body musculature. 
The problem in demonstrating the 
operation of negative drive stimula- 
tion would appear to involve the use 
of a task that was so effortful that it 
would outweigh the verbal biases of 
the subjects. Obviously, this was 
not accomplished in either of our 
two-choice ‘E.S.P.’ experiments. 
Such an experiment should be per- 
formed in order to test the applicabil- 
ity of reactive inhibition theorems to 
human behavior. 


(Manuscript received July 23, 1948) 


THE EFFECT OF MOTIVATING CONDITIONS 
ON THE ESTIMATION OF TIME 


BY ROBERT J. FILER AND DONALD W. MEALS¹ 
University of Pennsylvania 


INTRODUCTION 


The distance one travels toward a 
much desired destination frequently 
appears to be greater than the actual 
distance. Thus, an individual often 
experiences a short journey as ex- 
ceptionally long when he desires 
greatly to reach his objective. Simi- 
lar observations are no less common 
in judgments of time. While one is 
waiting for a late train or for an im- 
portant phone call, a few minutes of 
time appear to be hours. “It seems 
like hours” is a common expression 
of the person anxiously waiting for an 
important event. 

There have been numerous studies 
of temporal perception, but few have 
been related to the effects which 
motivation has upon judgments of 
duration. A study by Harton (2) 
indicates that the perception of time 
may be partly the result of tension. 
Rosenzweig and Koht (4) have related 
the estimation of duration to need- 
tension and have demonstrated that 
Ss under need-tension tend to estimate 
durations as less than objectively 
equal intervals free of need-tension. 
Presumably Ss in this investigation 
desired the time of the period in 
which they were asked to do an un- 
solvable task to be longer so they 
could continue to work on the task. 

From observations noted above it 
seems that individuals who desire the 
passing of an interval of time because 
they will reach a goal at the end of 
that interval will tend to overestimate 


1 The writers would like to express their appre- 
ciation to Dr. Francis W. Irwin for his valuable 
suggestions during this experimentation. 


the duration when they are asked to 
make judgments before the end of the 
period. Thus, one minute might be 
estimated as two when the original 
goal was at a temporal distance of 
five minutes. Experiments described 
below investigate the hypothesis that 
Ss who are motivated to have time 
pass will estimate a given period of 
time to be of longer duration than will 
Ss who are not so motivated. 

Two experimental procedures in 
which it was believed Ss desired a 
stated period of time to pass, and a 
control period in which it was believed 
that there was no particular motiva- 
tion for the passing of a stated time 
interval, were utilized in the investiga- 
tion. In the first experimental pro- 
cedure, it was assumed that college 
students liked to get out of class early 
and would desire the end of a task 
which had a time limit and which 
hindered them from leaving class. 
In the second experimental procedure 
it was assumed that if Ss believed 
they had a good chance of obtaining a 
prize at the end of a period of time they 
would desire that time to pass. Be- 
cause Ss were led to believe that the 
task they were asked to do was one 
many could do in the given time, it 
was thought they would associate the 
prize to be awarded for doing the task 
with the end of the time period, and 
thus would desire the completion of 
the time interval. There was no 
attempt made in the control procedure 
to motivate the Ss by giving them a 
goal that could be obtained at the 
end of a time interval as in the experimental 
procedures. 


EXPERIMENTAL PROCEDURES 
A. Experimental Group I 


Subjects.—This group consisted of three un- 
dergraduate psychology classes. The total num- 
ber of Ss was 67. One of the classes was a 
women’s class of 24 Ss and another was a class 
of 18 men. The third was a mixed class of 25 Ss. 
Because it was planned to use other similar 
groups, they were asked not to discuss the pro- 
cedures with any other students outside the 
class. 

Procedure.—At the end of a two-hour labora- 
tory session the members of the class were asked 
to cooperate in a short experiment. Class ses- 
sions which were concluded at least one half 
hour before the regular time were utilized. 
Each of the Ss was given a sheet of white paper 
of 8 X 114 in., after which the following instruc- 
tions were read to the class as a group. 


You are now to perform a 10-minute task. 
When you have finished you will be able to 
leave for the day. On the sheet of paper pro- 
vided you are to write down as many words as 
you can think of using alternate letters of the 
alphabet, beginning with A, as the first letter 
of the words. That is, you may use A, C, E, 
G, etc., as the first letter but not B, D, F, etc. 
Write as many words as you can think of beginning 
with A. When you have written all 
the words you can think of beginning with A 
go on to C. When you have finished with one 
letter skip a letter and go on to the next. You 
may use proper names. Are there any ques- 
tions? This is a 10-minute task. You will 
start when I say ‘go’. As soon as you have 
finished you will be through for the day. 


After four minutes and 37 seconds² Ss were told 
to stop working, as follows: 


Stop! Everyone stop working! Do not 
look at your watches. On the back of your 
paper in the lower right hand corner estimate 
the amount of time you have been working at 
this task, that is, the time from the signal go 
until you were stopped. The task was not 
actually 10 minutes in length. Indicate your 
estimated time in minutes and seconds. If 
you have looked at your watch at any time 
after you started until now please make a note 
of it beside the time estimate. 


Only three Ss from all procedures noted that 
they had observed a watch. The results of 
these Ss are not considered with the data dis- 
cussed below. 


2 This time was somewhat arbitrarily selected. 
It was decided that the Ss should be well into 
the task, and at the same time it was felt that 
the interruption should not be too close to the 
end of the 10-minute interval. 


B. Experimental Group II 


Subjects.—Undergraduate students in psychology 
classes were used. This group consisted 
of 71 Ss from two mixed classes, and was com- 
posed of 44 women and 27 men. The Ss were 
asked not to discuss the procedure after the class 
session. 

Procedure.—The experimental procedure was 
conducted at the beginning of the class hour. 
Each S was given a sheet of white paper 8 X 114 
in. in size, and the following instructions were 
read. 


We are going to ask you to do a 10-minute 
task for which you can win a prize. It has 
been found that many people can do this task. 
This is actually a contest against time. Each 
of you who wins will receive a box of candy 
exactly like this one. [A box of chocolates was 
shown to the group. ] 

You are to perform a 10-minute task. On 
the sheet of paper provided you are to write 
down as many words as you can think of using 
alternate letters of the alphabet beginning with 
A as the first letter of the words. That is, you 
may use A, C, E, G, etc. as the first letter but 
not B, D, F. Write as many words as you 
can think of beginning with A. When you 
have written all the words you can think of 
beginning with A go on to C. When you have 
finished with one letter skip a letter and go on 
to the next. You may use proper names. 
Each of you who succeeds in writing 150 
words or more within this 10-minute period 
will receive a one pound box of Whitman’s 
chocolates. Are there any questions? 


After 4 minutes and 37 seconds, the Ss were 
requested to cease working as follows: 


Stop! Everyone stop for just a moment. 
Do not look at your watches. In the upper 
right hand corner of your paper write your 
estimate of the amount of time you have been 
working at this task. That is—the time from 
the signal begin until you were stopped. The 
task was not actually 10 minutes in length. 
Indicate the estimated time in minutes and 
seconds. If you have looked at your watch 
at any time after you started until now please 
make a note of it beside the time estimate. 


After the Ss had made their estimates they 
were asked to write their name and sex on the 
paper, and were informed that as they were not 
allowed to complete the task, the person who 
wrote the largest number of words would be 
awarded the box of candy. 


C. Control Group 


Subjects.—Sixty undergraduate students, 31 
women and 29 men, constituted the control 
group. These students came from two psychology 
classes at the University of Pennsylvania. 

Procedure.—The instructions for the control 
group were similar to those given to experimental 
group I, except that Ss were told that when they 
finished the task they would go on with the class 
work. The task was administered at the be- 
ginning of the class session. Ss were interrupted 
after working 4 min. and 37 sec., and were asked 
to estimate the length of time they had been 
performing the task. Again, Ss were asked not 
to discuss the experiment outside of class. 


RESULTS 


It will be noted that the obtained 
results are as predicted. The mean 
time estimate for the control group 
was 290.2 sec. and the SD of the dis- 
tribution was 108.3 sec. For the 31 
women in the control group, the mean 
estimate was 293.9 and for the 29 men 
it was 286.2. This difference of 7.7 
sec. would occur by chance 97 times 
in 100. 

Experimental group I (permitted 
to leave class early) had a mean time 
estimate of 326.5 sec. and an SD of 
77.2 sec. The difference of 36.3 sec. 
between the control group and experimental 
group I means was significant 
at the .01 level. For experimental 
group II (offered prize) the 
mean time estimate was 330.5 sec. 
and the SD was 126.1. The differ- 
ence of 40.3 sec. between the control 
group and experimental group II was 
significant at the .02 level. Because 
the direction of the difference was as 
predicted, only one tail of the error 
curve was used to test the significance 
of the differences. 
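
A one-tailed test of this kind can be sketched as follows. The sketch assumes a large-sample critical ratio (normal approximation) and takes the group sizes as the numbers of Ss given in the procedure sections (60, 67, and 71); both are assumptions, since the test actually used is not described and the three Ss who consulted a watch were dropped from unspecified groups. The function names are hypothetical.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def one_tailed_p(z):
    """Upper-tail probability of the unit normal curve beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# (mean estimate in sec., SD in sec., assumed number of Ss)
control = (290.2, 108.3, 60)
group_1 = (326.5, 77.2, 67)
group_2 = (330.5, 126.1, 71)

for label, group in (("I", group_1), ("II", group_2)):
    z = critical_ratio(*group, *control)
    print(f"Group {label}: CR = {z:.2f}, one-tailed p = {one_tailed_p(z):.3f}")
```

Because the group sizes and the form of the test are assumed, the probabilities obtained this way need not agree exactly with the levels reported above.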

Data of the experiment revealed 
other interesting results not directly 
related to the hypothesis under in- 
vestigation. Under the motivating 
conditions for the experimental 
groups, the number of words written 
by the Ss increased significantly. 
The difference between the mean 
number of words written by the 
control group and experimental group 
I was significant beyond the .01 level, 
and the difference between the mean 
number of words written by the con- 
trol and experimental group II, also, 
was significant at the .01 level. This 
might be expected to be true for 
experimental group II, since these Ss 
were told that if they wrote 150 words 
in 10 minutes, they would win a 
prize; but the increase was also sig- 
nificant for experimental group I 
where a goal of writing a definite 
number of words was not explicit. 
It further appeared that there was 
no relationship between the number 
of words written and time estimations. 
Pearson product-moment r’s between 
words and time for the control and 
the two experimental groups were in 
all three cases close to zero. Correlating 
words against time for the 
results obtained from experimental 
group I yielded a coefficient of -.28. 
The coefficient of correlation for words 
against time from results of experimental 
group II was +.19. In the 
case of the control group the coefficient 
was -.18. 
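
For readers who wish to repeat this check on their own records, the product-moment coefficient can be computed as sketched below. The paired values are invented solely for illustration, since the individual records are not given in this report, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two paired lists of at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical records: words written and estimated time (sec.) for five Ss.
words = [31, 45, 38, 52, 40]
seconds = [300, 280, 360, 310, 250]
print(round(pearson_r(words, seconds), 2))
```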

There seemed to be some tendency 
for women to be more affected than 
men by the experimental conditions, 
but the difference between men and 
women in the experimental groups was 
not significant. 


DISCUSSION 


The results from the experiment 
confirm the hypothesis that individ- 
uals motivated to complete a task 
will believe that they have worked 
longer at the task than individuals 
not so motivated when these individ- 
uals are interrupted before completion 
of the task, after both groups have 
worked an equal amount of time. 

To account for the results it can- 
not be argued that tension per se will 
cause Ss, who are motivated to have 
time pass, to estimate a given period 
of time to be of longer duration than 
will Ss who are not so motivated. 
Rosenzweig and Koht have shown the 
reverse to be true under particular 
conditions, as was indicated in the 
introduction. Since there is no sig- 
nificant correlation between the num- 
ber of words written and the time 
estimate, the fact that Ss wrote more 
words under the experimental condi- 
tion will not account for the increased 
estimation of time. 

The writers suggest the following 
explanation which should be investi- 
gated further. Wright (5), Irwin 
and Gebhard (3), and others, have 
presented evidence that psychological 
distance tends to affect the attractive- 
ness of a goal. It seems likely that 
the attractiveness of a goal will affect 
the psychological distance one must 
travel to reach that goal. The ex- 
perimental groups in our study were 
motivated so that they desired the 
completion of a time interval. Thus, 
when they were interrupted and 
asked how long they had been work- 
ing, they by their estimation placed 
themselves closer to the end of the 
ten-minute interval than Ss who 
presumably had no special desire for 
the time to be completed. Motivated 
Ss, by overestimating the time, placed 
themselves nearer the moment they 
could leave class or win a prize. 

There is probably some relationship 
between the function the distance 
plays in reaching a goal and any 
estimation of distance traveled. 
Thus, if the passing of time interferes 
with the attainment of a goal, results 
such as Rosenzweig and Koht found 
might be expected. The effect that 
reaching an undesirable event at the 
end of a given distance will have upon 
estimations of distances also needs to 
be investigated. 

Working with rats, Crutchfield (1) 
reported that psychological distance for 
the animals was influenced by the 
degree of motivation. Blind 
rats were trained to travel a runway 
and to turn in at a side alley six feet 
from a starting point for food. Re- 
sults from the experiment indicate 
that the longer the rats were deprived 
of food, the farther they traveled 
down the runway before making a 
turn into a side alley. Because of the 
results obtained in our experiment, 
the reverse might have been expected, 
i.e., the more deprived of food the 
rats were, the more they would over- 
estimate the distance traveled and 
thus the hungrier rats would turn 
before the less hungry ones. Crutch- 
field concludes that if the intensity of 
a need is increased the distance to the 
goal should be reacted to as if it were 
increased. There are several sig- 
nificant differences between Crutch- 
field’s experiment and our investiga- 
tion which raise questions about 
comparing results. One is the differ- 
ence in Ss and another is the medium 
considered to separate the Ss from 
their goals. Ours involved time, 
while the rats had to travel a physical 
distance. The motivation in Crutch- 
field’s work was certainly different 
from the motivation reported in this 
article. Also our Ss were interrupted 
and asked to view their distance retro- 
spectively. Whether the rats did 
this is open to question. 


SUMMARY 


The hypothesis that Ss who are 
motivated to have time pass will 
estimate a given period of time to be 
of longer duration than will Ss who 
are not so motivated was confirmed 
by results obtained under two independent 
experimental conditions. 
The possibility that an attractive 
goal affects the psychological distance 
to the goal has been suggested tentatively 
as an explanation of the 
observations. 


(Manuscript received July 12, 1948) 


A FURTHER ANALYSIS OF THE VARIABLES IN 
CYCLICAL MOTOR LEARNING 


BY GREGORY A. KIMBLE 


Brown University 


INTRODUCTION 


In a previous paper, Kimble and 
Bilodeau (3) presented a preliminary 
analysis of the separate and joint 
effects of the work and rest variables 
in cyclical motor learning. Their 
findings may be summarized as fol- 
lows: 


1. Of the two variables, the length of the 
work period is the more important in that it 
contributes a greater score difference than rest 
when we are dealing with equal units of work 
time and rest time. 

2. There is no interaction between the two 
variables. That is, the effect of changing the 
magnitude of one does not depend upon the 
value of the other. 

3. The score differences produced by a varia- 
tion in the length of the practice trial are ap- 
parent at the end of the first learning trial and 
are maintained at a fairly constant level from 
that time on. What this means is that lengthen- 
ing the trials in the motor learning experiments 
has the effect of producing an immediate and 
constant performance decrement. 

4. Score differences produced by a variation 
in the amount of rest between trials are an in- 
creasing function of the amount of previous 
practice. The rest variable does not help deter- 
mine the level of performance to any important 
degree until after two or three min. of practice 
have elapsed. 

5. The joint effect of simultaneous variation 
of both work and rest is the simple summation 
of the separate effects obtained when the two 
variables are varied one at a time. 

These results seem important 
enough to warrant an attempt at 
verification. Furthermore, Kimble 
and Bilodeau gave only a very tentative 
answer to the question of the 
nature of the function according to 
which the separate effects of work and 
rest (items 3 and 4 in the summary 
above) cumulate with practice. The 
analysis to be presented in this paper 
gives support to the general conclusions 
drawn from the results of the 
Kimble and Bilodeau experiment. 
It provides us with further information 
on the nature of the functions in- 
volved. And, finally, it serves the 
useful purpose of demonstrating that 
the previously reported findings are 
not limited to a particular learning 
task or to a particular selection of 
work and rest intervals. 


SOURCE OF THE DATA TO 
BE ANALYZED¹ 


The analysis of the work and rest variables in 
motor learning to be presented in this paper is 
based on data obtained by Kientzle (1) and by 
Kimble (2). Both of these investigators used 
an alphabet printing task in experimental inves- 
tigations involving substantial numbers of Ss. 
For our present purposes, the important differ- 
ence between the two experiments was that 
Kientzle used 60-sec. practice trials whereas 
Kimble used 30-sec. practice trials. Both investigators 
ran groups which were allowed 5 sec. 
and 30 sec. rest between trials. Thus data from 
four groups of Ss were available. In Table I the 
relevant facts on the experimental conditions 
and numbers of Ss in each group are presented. 


TABLE I 

SUMMARY OF THE WORK-REST CONDITIONS IN THE FOUR EXPERIMENTAL GROUPS TO BE USED IN THE ANALYSIS 

Group    Investigator 
1        Kientzle        52 
2        Kientzle        56 
3        Kimble          39 
4        Kimble          46 

* The first number is the number of sec. of 
practice. The second number is the number of 
sec. of rest between practice trials. The notations 
that appear in this column will be used in 
the text and on the graphs to refer to the four 
experimental conditions. 


1 Thanks are due Dr. Mary J. Kientzle for 
her permission to use the data from two of her 
experimental groups. 

Another difference between the two experi- 
ments which somewhat restricts our analysis is 
that, for experimental reasons unessential to our 
present inquiry, Kimble’s Ss were given only 
10 min. (20 30-sec. trials) of practice. Our 
analysis, therefore, must also be limited to the 
first 10 practice trials in the case of the data 
obtained by Kientzle. 


RESULTS 


In all of the results to be reported, 
we shall be dealing with scores or 
score differences as a function of 
minutes of practice. Since Kimble’s 
Ss were given 30-sec. practice trials, 
scores for successive pairs of trials 
have been added in the case of the 
30-5 and the 30-30 conditions to 
obtain scores for a full minute of 
practice. In the case of the data 
from the Kientzle experiment, the 
scores are mean scores which she 
reports for 10 successive minutes of 
practice. 
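
The pairing of 30-sec. trials into minute scores amounts to the following. This is a minimal sketch; the trial scores shown are hypothetical and the function name is not part of either original experiment.

```python
def per_minute_scores(trial_scores, trials_per_minute=2):
    """Add successive groups of trials so that each sum covers one minute of practice."""
    if len(trial_scores) % trials_per_minute != 0:
        raise ValueError("number of trials must fill whole minutes")
    return [
        sum(trial_scores[i:i + trials_per_minute])
        for i in range(0, len(trial_scores), trials_per_minute)
    ]


# Hypothetical mean scores on four successive 30-sec. trials.
half_minute_scores = [14, 16, 17, 18]
print(per_minute_scores(half_minute_scores))  # [30, 35]
```

Twenty 30-sec. trials treated in this way yield the 10 minute-by-minute scores that are compared with the 10 trials of the 60-sec. conditions.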


Our results consist of four graphs 
and one table which have been de- 
vised to present the necessary infor- 
mation on the effect of the work and 
rest variables on cyclical motor learn- 
ing. Learning curves for the four 
groups are presented in Fig. 1. The 
notation on each curve gives the 
number of seconds of work and rest 
(in that order) in the experimental 
condition for which the curve has 
been constructed. 

Some of the more important results 
of our analysis can be seen directly in 
the relationships among these curves. 
Note first of all that the two curves 
involving 30 sec. of practice per trial 
begin at a higher point than those for 
conditions where the trials are 60 sec. 
long. Note also that the area be- 
tween the curves for the 30-5 and the 
60-5 conditions seems to be relatively 
constant throughout practice. The 
same relationship obtains between the 
curves for the 30-30 and the 60-30 
conditions. Next observe that, in the 
curves representing the performance of 
the two 30-sec. or the two 60-sec. work 
conditions, the initial points are at 
almost identical ordinate positions. 
It is only after about four min. of 
practice that the curves become 
clearly separated. When the separation 
occurs, the curve for the condition 
with the longer resting time becomes 
superior. 


[Figure 1: mean score (ordinate) against minutes of practice (abscissa).] 

Fig. 1. Learning curves showing mean numbers of letters printed during successive min. of 
practice. The notations on the curves refer to number of sec. of practice and number of sec. of rest 
in that order. Points for the 30-sec. practice groups were obtained by adding scores for two trials. 


TABLE II 

THE EFFECT OF A 30-SEC. DIFFERENCE IN THE LENGTH OF THE PRACTICE TRIAL 
WHEN THE LENGTH OF THE REST PERIOD IS 5 SEC. OR 30 SEC. 
MEAN MINUTE BY MINUTE SCORE DIFFERENCES 

Conditions              Minutes of Practice 
(30-5) - (60-5) 
(30-30) - (60-30) 

Fig. 2 and Table II are designed to 
show the absence of any interactive 
effect between the work and rest 
variables. In Fig. 2, trial by trial 
score differences resulting from a 
difference in the length of the inter- 
trial rest pause are plotted. These 
differences were obtained by subtracting 
mean scores for the conditions 
allowing the shorter rest from the 
mean scores of the condition allowing 
the longer rest. Since there were two 
different lengths of practice, two 
difference curves could be obtained. 
The curve with the open circles is a 
curve of the successive score differences 
between the 30-5 and 30-30 
conditions. The curve with the 
closed circles is a similar curve for the 
two 60-sec. practice conditions. Note 
particularly that the courses of the 
two curves seem to be the same. This 
is taken as evidence that the effect of 
a 25-sec. rest difference is the same 
for a 30-sec. as for a 60-sec. practice 
condition. 


[Figure 2: score difference (ordinate) against number of minutes of practice (abscissa); curves labelled (30-30) - (30-5) and (60-30) - (60-5).] 

Fig. 2. Graph showing the increasing score 
difference resulting from a 25-sec. difference in 
the length of the rest period and the absence of 
any interaction between the effects of work and 
rest. The points plotted are the minute by 
minute score differences for the groups indicated. 

In Table II, data are presented 
which show that the effect of a vari- 
ation in the length of the practice 
trial is nearly identical for two condi- 
tions allowing different amounts of 
inter-trial rest. The top row of the 
table shows the score differences ob- 
tained by subtracting the scores of 
the 60-5 condition from those of the 
30-5 condition trial by trial. The 
bottom row shows the results ob- 
tained when the 60-30 scores are 
subtracted from those for the 30-30 
condition. The near identity of the 
two sets of values indicates that the 
effect of a variation in the length of 
the work period is independent of the 
length of the rest period. 
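
The comparison made in Table II can be illustrated with a short sketch. The minute scores below are invented for the purpose (three minutes only) and are not the values of either experiment; the function name is likewise hypothetical. If the work effect is independent of the rest period, the two rows of differences agree and their gap is near zero at every minute.

```python
def differences(minuend, subtrahend):
    """Minute by minute differences between two lists of mean scores."""
    if len(minuend) != len(subtrahend):
        raise ValueError("both conditions need the same number of minutes")
    return [a - b for a, b in zip(minuend, subtrahend)]


# Hypothetical mean scores for three successive minutes of practice.
scores = {
    "30-5": [40, 44, 47],
    "60-5": [36, 40, 43],
    "30-30": [41, 47, 52],
    "60-30": [37, 43, 48],
}

top_row = differences(scores["30-5"], scores["60-5"])        # work effect at 5 sec. rest
bottom_row = differences(scores["30-30"], scores["60-30"])   # work effect at 30 sec. rest
gap = differences(top_row, bottom_row)                       # zero when there is no interaction

print(top_row, bottom_row, gap)
```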

Reference again to Fig. 1 will show 
that the 30-30 condition produced the 
most efficient learning. The fact 
that the other curves are lower than 
this one is taken as evidence that a 
longer practice period or a shorter 
inter-trial rest period, or both, reduced 
the level of performance. Fig. 
3 has been constructed to show the 
amount of this reduction. The points 
plotted were obtained by subtraction, 
trial by trial, of the mean scores in 
the other conditions from those of the 
30-30 condition. The resulting 
curves show us the effect throughout 
learning of a variation in one or both 
of our independent variables. The 
curve which starts at the lowest point 
shows the effect of decreasing the 
length of the rest pause by 25 sec. 
It was constructed by plotting the 
mean score differences between the 
30-30 and 30-5 conditions. It demonstrates 
that the effectiveness of a 
25-sec. difference in the length of the 
inter-trial rest increases with practice. 
The shape of the curve is roughly 
ogival. 

The intermediate, relatively hori- 
zontal, curve shows the effect of 
introducing a 30-sec. difference in the 
length of the practice trial. It was 
constructed by plotting the mean 
score differences between the 30-30 
and the 60-30 conditions. The out- 
standing characteristics of the curve 
are the immediacy of the occurrence 
of the indicated score difference and 
the flatness of the curve. This curve 
demonstrates that the effect of length- 
ening the practice trial is to produce 
an immediate and constant decrement 
in performance. 

The upper curve shows the effect of 
decreasing the length of the rest 
period and increasing the length of 
the practice trial by 25 sec. and 30 
sec. respectively. It was constructed 
by plotting the mean score differ- 
ences between the 30-30 condition and 
the 60-5 condition. Note that, in 
shape, this curve resembles that which 
shows the effect of the rest difference 
alone, but that it is displaced above 
it by an amount roughly equal to the 
score difference contributed by a 
variation in the length of the work 
period. It is these relationships that 
suggest that the separate effects of 
work and rest add to produce their 
joint effect. 


[Figure 3: score difference (ordinate) against number of minutes of practice (abscissa); curves labelled (30-30) - (60-5), (30-30) - (60-30), and (30-30) - (30-5).] 

Fig. 3. Graph showing the effect of (a) a 
25-sec. difference in the length of the rest period 
(curve starting at lowest point), (b) a 30-sec. 
difference in the length of the practice trial 
(intermediate, horizontal curve) and (c) a simultaneous 
variation of work and rest by 30 sec. and 
25 sec. (upper curve). The plotted points were 
determined by performing the subtractions indicated 
in the notation on each curve. 

In Fig. 4, the additivity of the 
effects of work and rest is tested. 
Recall that the top curve of Fig. 3 
represented the effect of a simultane- 
ous variation of 30 sec. in the length 
of the practice trial and 25 sec. in the 
length of the inter-trial rest. This 
curve has been redrawn in Fig. 4 and 
labelled, ‘observed difference.’ Indi- 
vidually the two remaining curves in 
Fig. 3 illustrated the effect of shorten- 
ing the rest period by 25 sec. and 
lengthening the work period by 30 
sec.; i.e., individually, they represent 
the same differences as are represented 
jointly in the top curve. In Fig. 4, 
the curve labelled, ‘calculated differ- 
ence,’ is the trial by trial summation 
of the separate difference scores for 
the work and rest functions in Fig. 3. 
The fact that the two curves in Fig. 
4 are so similar is taken as evidence of 
the additivity of the separate effects 
of work and rest in producing their 
joint effects. 
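
The construction of the two curves can be sketched as follows. The minute scores are hypothetical (three minutes only) and the function names are not taken from the original analysis. The sketch forms the calculated difference as the sum of the separate rest and work differences and compares it with the observed difference.

```python
def differences(minuend, subtrahend):
    """Minute by minute differences between two lists of mean scores."""
    return [a - b for a, b in zip(minuend, subtrahend)]


def additivity_check(s30_30, s30_5, s60_30, s60_5):
    """Return the observed difference, the calculated difference, and their discrepancy."""
    rest_effect = differences(s30_30, s30_5)    # 25 sec. less rest
    work_effect = differences(s30_30, s60_30)   # 30 sec. more work
    calculated = [r + w for r, w in zip(rest_effect, work_effect)]
    observed = differences(s30_30, s60_5)       # both changes at once
    discrepancy = differences(observed, calculated)
    return observed, calculated, discrepancy


# Hypothetical mean scores for three successive minutes of practice.
observed, calculated, discrepancy = additivity_check(
    s30_30=[41, 47, 52],
    s30_5=[40, 44, 47],
    s60_30=[37, 43, 48],
    s60_5=[36, 39, 44],
)
print(observed, calculated, discrepancy)  # [5, 8, 8] [5, 7, 9] [0, 1, -1]
```

Algebraically the discrepancy equals (30-5) + (60-30) - (30-30) - (60-5) at each minute, which is the interaction contrast. The graphical test of additivity therefore examines the same quantity as the comparison of the paired difference curves.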


[Figure 4: decrement (ordinate) against minutes of practice (abscissa); curves labelled 'observed difference' and 'calculated difference.'] 

Fig. 4. Graphical test of the hypothesis that 
the separate effects of work and rest add to 
produce their joint effect. For the method of 
obtaining the plotted points, see text. 


DISCUSSION 


With one exception, the results of 
this analysis support the previous one 
of Kimble and Bilodeau (3). The 
exception concerns the relative importance 
of the work and rest variables. 
Kimble and Bilodeau argued 
that, of the two variables, work was 
the more important. The present 
results, however, show that this is not 
accurate. Specifically, late in learn- 
ing, a rest difference of 25 sec. was 
shown (Fig. 3) to contribute a greater 
score difference than a 30-sec. difference 
in the length of the practice 
trial. It is, therefore, probably a 
mistake to say that either work or rest 
is the more important variable. The 
other statements made by Kimble and 
Bilodeau have been shown to apply 
quite well to a new learning task and 
to different amounts of work and rest. 

For example, we have been able to 
confirm the finding that there is no 
interaction between the effects of our 
two independent variables. Interaction 
is employed in the statistical 
sense. Referring directly to our two 
variables, we may say that, if a variation 
in the length of the practice 
trial produces an effect which is inde- 
pendent of the associated amount of 
rest, then there is no interaction. 
Conversely, if a variation in the 
amount of rest allowed between trials 
produces an effect which is independ- 
ent of the associated length of the 
practice trial, then there is no inter- 
action between the effects of work and 
rest. Both Fig. 2 and Table II were 
set up to make this point. In Table 
II, we saw that the score difference 
contributed by a 30-sec. variation in 
the length of the practice trial was 
about the same whether the rest 
period was 5 sec. or 30 sec. In Fig. 
2, we saw that performance differ- 
ences resulting from a 25-sec. vari- 
ation in the length of the rest period 
were similar for both the 30-sec. and 
the 60-sec. practice conditions.
Hence our conclusion that there is no 
interaction between the effects of 
work and rest. 
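
The statistical sense of interaction described above can be sketched as a difference of differences in a 2 x 2 table of cell means. This is only an illustration: the function name and the cell means are invented for the example and are not taken from Table II or Fig. 2.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 (work x rest) table of means.

    cell_means maps (work_sec, rest_sec) -> mean score.
    A value near zero means the work effect is the same at both rest levels.
    """
    work_levels = sorted({work for work, _ in cell_means})
    rest_levels = sorted({rest for _, rest in cell_means})
    (w_lo, w_hi), (r_lo, r_hi) = work_levels, rest_levels
    work_effect_short_rest = cell_means[(w_hi, r_lo)] - cell_means[(w_lo, r_lo)]
    work_effect_long_rest = cell_means[(w_hi, r_hi)] - cell_means[(w_lo, r_hi)]
    return work_effect_long_rest - work_effect_short_rest


# Hypothetical cell means, invented for illustration only.
# Work periods of 30 and 60 sec.; rest periods of 5 and 30 sec.
example = {(30, 5): 20.0, (60, 5): 14.0, (30, 30): 26.0, (60, 30): 20.0}
print(interaction_contrast(example))  # 0.0
```

A contrast of zero corresponds to the case in which the effect of the longer work period is the same whether the rest period is short or long; the same contrast results if the roles of work and rest are exchanged.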

In the earlier paper, Kimble and 
Bilodeau concluded that the advan- 
tages of a longer rest period increase as 
a function of the amount of practice. 
This conclusion is substantiated by 
the present analysis. Reference to 
Fig. 2 shows this clearly. The shape 
of the function in question now seems 
to be fairly well established as 
S-shaped.

Perhaps the question left most in
doubt in the earlier paper was con- 
cerned with the effect of differences 
in the length of the work period over 
a number of learning trials. The con- 
clusion that this effect was constant 
from the first trial on was made on the 
grounds that it was the simplest 
possible conclusion and that it fit 
well with certain theoretical con- 
siderations. The data, however, did 
not support the conclusion particularly
well. Our present findings show
that the straight line hypothesis was 
a fairly good guess. The curve of 
Fig. 3 which depicts the trial by trial 
effect of the work difference is es- 
sentially horizontal. Similarly, in 
Table II we note that the variation in 
score differences contributed by the 
work difference is small and that, 
when all of the data are considered, 
there is no consistent trend. 

This leads us finally to the hypoth- 
esis advanced by Kimble and Bilodeau 
that the joint effects of work and rest 
are the simple summation of the two
separate effects. This hypothesis was
retested in this paper in Fig. 4, where 
the correspondence between the cal- 
culated and observed differences was 
seen to be quite close. In other 
words, given a knowledge of the 
separate effects of a particular vari- 
ation in the length of the work and 
rest periods, we can predict the joint 
effect producible by a simultaneous
variation of both factors by these
amounts. It will be the sum of the
two separate effects. 
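
The additivity hypothesis can be sketched as a trial-by-trial sum of the two separate difference curves, compared with the observed joint difference. The function names and all numerical values below are hypothetical; they are not the points plotted in Fig. 3 or Fig. 4.

```python
def predicted_joint_effect(work_effect, rest_effect):
    """Trial-by-trial sum of the separate work and rest difference scores."""
    return [w + r for w, r in zip(work_effect, rest_effect)]


def max_abs_discrepancy(observed, predicted):
    """Largest absolute gap between observed and calculated differences."""
    return max(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical difference scores for four successive trials.
work_effect = [2.0, 2.0, 2.0, 2.0]   # roughly constant, as for the work variable
rest_effect = [0.5, 1.0, 2.0, 2.5]   # growing with practice, as for the rest variable
observed_joint = [2.5, 3.5, 4.0, 4.0]

calculated = predicted_joint_effect(work_effect, rest_effect)
print(calculated)                                       # [2.5, 3.0, 4.0, 4.5]
print(max_abs_discrepancy(observed_joint, calculated))  # 0.5
```

Small discrepancies of this kind between the calculated and observed curves are what a graphical test of additivity looks for.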


(Manuscript received April 5, 1948)


CORRECTION VS. NON-CORRECTION LEARNING
TECHNIQUES AS RELATED TO REMINISCENCE
IN SERIAL ANTICIPATION LEARNING


BY CLAUDE E. BUXTON
Northwestern University 


AND 


MILDRED B. BAKAN 
Ohio State University


Each report in this series of three is 
concerned with one of the variables 
which has not been controlled in a 
uniform way in the principal studies 
of reminiscence; the intent is to check 
systematically upon whether certain 
minor variables could account for any 
of the discrepancies among major 
results. For example, Ward (6) re- 
quired learners to spell syllables, 
whereas Hovland’s Ss (3) pronounced 
them; it was found (2) that this factor 
by itself does not produce significantly 
different amounts of reminiscence. 
Also, Melton and Stone (5) controlled 
rehearsal during the rest period by 
requiring rapid color-naming from a 
color-board, whereas Hovland pre- 
sented colors at a two-sec. rate on the 
memory drum; it was found (7) that 
these two rehearsal-preventing tech- 
niques do not produce reliably differ- 
ent recalls, but if in addition Ss are 
instructed to rehearse or not to re- 
hearse while naming colors from the 
drum, the set thus induced leads to 
reliable differences in recall, in the 
expected direction. A third factor 
arises in Hovland’s instruction that 
his Ss correct all errors by reading the 
missed syllable aloud before going on
to the next one, whereas Ward, and
Melton and Stone, made no specific 
mention of correction techniques. 


The present study attempts to vary
this factor systematically, and observe
whether recall scores after an
interval of rest are thereby affected.


PROCEDURE 


Subjects.—The data to be reported here were 
obtained in two different universities, by differ- 
ent investigators, but are combined as though 
this were not the case. The experiment was first 
performed by Bakan (1) at the State University 
of Iowa, but for unknown reasons the perform- 
ance of certain Ss on one condition was atypical 
and seemingly explained only in terms of chance. 
The experiment was therefore extended at North- 
western, by a nearly exact duplication of Bakan’s 
procedure. (There were small differences in 
such respects as the appearance of the drum and 
shutters, illumination, style of type used for the 
syllables, order in which lists were used, etc. 
Since such variations affected the two main ex- 
perimental groups equally, it is assumed they 
may be disregarded.) Such indices as speed of 
original learning, variability of learning rates, 
etc., were examined in the two samples, and it 
was then thought reasonable to combine them to 
make one larger sample. So, two groups of 48 
college students each (referred to below as the 
correction and the non-correction groups), were 
used. Half of each group was drawn from intro- 
ductory psychology classes at each of the two 
universities involved. All were new to psycho- 
logical experimentation. 

Lists.—Syllables were presented on a modified 
Hull type drum at a two-sec. rate. There was a 
‘ready’ symbol two sec. before the first syllable, 
and four sec. between the last syllable and the 
cue symbol. S always learned by pronouncing. 
Lists were prepared for three practice sessions 
and two experimental sessions, each list contain- 
ing 12 typewritten syllables of low association 
value. All lists were selected from the Melton 
materials (4). 

Rest-interval activity —The rest interval was 
two min. long; during this period S named
colors. At the side of each syllable tape, 15
color patches were mounted so that one was 
visible in the drum-aperture at each position of 
the drum when so desired. The same color se- 
quence was used for each tape. Care was taken 
to provide practice at shifting from memorizing 
to color naming and back again, by inserting such 
shifts several times during the practice days. 

Experimental design—Ss were assigned at 
random to either the correction or the non- 
correction groups. In all respects but learning 
technique these two groups were treated alike. 
The instructions with respect to learning tech- 
nique were conventional, except that Ss in the 
correction group were instructed to read aloud 
all syllables which they failed to anticipate cor- 
rectly, whereas Ss in the non-correction group 
were instructed not to correct errors before at- 
tempting to anticipate the next syllable. Dur- 
ing the first two of the three practice days, E 
interrupted S if necessary to make his technique 
conform to the instructions, but such interruptions
were never necessary after this. (Although,
from the point of view of design, it is
not entirely efficient to employ different groups 
of Ss for the correction and non-correction tech- 
niques, Bakan discovered in a preliminary study 
that a given S could not readily set himself to 
reverse techniques on successive days. To do 
the experiment at all, therefore, two separate 
groups must be used.) 

During the two experimental sessions, each S 
in each group served in an experimental (two 
min. rest) and control (no-rest) condition. The 
pre-rest criterion of mastery was 7 out of 12 
syllables. In the no-rest condition, the trial on 
which S first achieved 7/12 was treated as the 
criterion trial, and the next trial as the recall. 
Orders of rest and no-rest conditions, and lists, 
were counterbalanced systematically. 
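
One way in which orders of conditions and lists might be counterbalanced is sketched below. The paper does not state the scheme actually used, so the rotation through four cells, the function name, and the list labels are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical labels; the paper does not name the two experimental lists.
CONDITION_ORDERS = [("rest", "no-rest"), ("no-rest", "rest")]
LIST_ORDERS = [("list A", "list B"), ("list B", "list A")]


def counterbalanced_schedules(n_subjects):
    """Cycle subjects through the four condition-order x list-order cells.

    Each schedule is [(condition, list) for session 1, same for session 2].
    """
    cells = list(product(CONDITION_ORDERS, LIST_ORDERS))
    schedules = []
    for subject in range(n_subjects):
        condition_order, list_order = cells[subject % len(cells)]
        schedules.append(list(zip(condition_order, list_order)))
    return schedules


schedules = counterbalanced_schedules(48)
usage = Counter(tuple(schedule) for schedule in schedules)
for schedule, count in usage.items():
    print(schedule, count)
```

With 48 Ss in a group, each of the four schedules is used 12 times under this assumed rotation, so that neither condition nor list is confounded with session.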

When he had finished the experiment, each S 
was questioned concerning the amount of re- 
hearsal during color-naming, and, in the non- 
correction group, the degree of conformity to 
instructions not to correct errors. This was 
thought necessary because, although no S cor- 
rected himself aloud, there was opportunity for 
silent self-correction. 


RESULTS


Recall data—A summary of cri- 
terion trial and recall trial data, for 
all conditions, is presented in Table I. 
Initial mastery for all conditions is 
fairly constant, but inspection sug- 
gests that at recall the rest and no-rest 
conditions are different. The indication
is one in the direction of reminiscence
when the correction technique
has been employed, but forgetting
when the non-correction technique
has been employed.


TABLE I

Means of Criterion Trial Scores and
Recall Trial Scores, and Their SD’s
(N in each group = 48)

                Non-Correction, Rest
M (cri.)        7.71
SD               .86
M (rec.)        6.17
SD              1.75

[Entries for the other conditions are missing from this copy.]

To evaluate the tendencies in Table 
I properly, analysis of covariance may 
be used; we may thus allow for such 
differences as did exist in level of 
mastery before rest. To put the data
in a form suitable for this type of 
analysis the following steps were 
taken: first, the difference between 
criterion trial scores for rest and no- 
rest conditions was secured for each 
S (with proper algebraic sign, this 
indicates any failure of matching in 
level of mastery before rest). Sec- 
ond, the difference between recall 
trial scores for rest and no-rest condi- 
tions was secured for each S (with 
proper algebraic sign, this indicates 
the effect of the rest period). These 
two difference scores were used as the 
X and the Y terms respectively, in 
the analysis of covariance. (Variance 
attributable to correlation between 
scores for an individual is removed 
by the computations just described.) 
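
The two steps just described, and the adjustment of the Y means for group differences in X, can be sketched as follows. The adjustment shown is the usual covariance formula (group Y mean minus the pooled within-group slope times the deviation of the group X mean from the grand X mean); the paper does not print its computations, so this is an assumption. The field names and the scores are hypothetical.

```python
def difference_scores(subject):
    """Return (X, Y) for one S: rest minus no-rest, at criterion and at recall."""
    x = subject["crit_rest"] - subject["crit_norest"]
    y = subject["recall_rest"] - subject["recall_norest"]
    return x, y


def adjusted_means(groups):
    """Covariance-adjusted Y means; groups maps name -> list of (x, y) pairs."""
    def mean(values):
        return sum(values) / len(values)

    sxx = sxy = 0.0          # pooled within-group sums of squares and products
    all_x, group_means = [], {}
    for name, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        mx, my = mean(xs), mean(ys)
        sxx += sum((x - mx) ** 2 for x in xs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        all_x.extend(xs)
        group_means[name] = (mx, my)
    slope = sxy / sxx        # assumes X varies within at least one group
    grand_x = mean(all_x)
    return {name: my - slope * (mx - grand_x)
            for name, (mx, my) in group_means.items()}


# Hypothetical scores for one S, then hypothetical (X, Y) pairs for two groups.
one_subject = {"crit_rest": 8, "crit_norest": 7, "recall_rest": 9, "recall_norest": 7}
print(difference_scores(one_subject))  # (1, 2)

toy_groups = {
    "correction": [(0, 1), (1, 2), (-1, 0)],
    "non-correction": [(1, 0), (2, 1), (0, -1)],
}
print(adjusted_means(toy_groups))  # {'correction': 1.5, 'non-correction': -0.5}
```

A positive adjusted mean corresponds to recall after rest exceeding immediate recall, and a negative one to forgetting in the rest condition.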

The adjusted (Y) ‘difference score’ 
mean for the correction group is +.454 
(showing that recall after rest was 
superior to immediate recall), and for 
the non-correction group is —.975 
(showing that forgetting occurred in 
the rest condition). These tendencies
are significantly different (F = 7.18;
.01 level = 6.92, for 93 d.f.).

Accessory data.—Difficulty of learn- 
ing to the criterion does not indicate, 
in these data, anything about the level 
of recall to be expected. The mean 
numbers of trials to the criterion, with 
their SD’s in parentheses, were as 
follows: correction-rest, 9.04 (4.12);
correction-no-rest, 9.13 (4.06);
non-correction-rest, 8.79 (4.21);
non-correction-no-rest, 8.71 (4.01). N = 48
for each mean. As inspection indi- 
cates, none of the differences among 
these means even approaches sig- 
nificance when tested by analysis of 
variance. 

As shown by analysis of co-variance 
in trials to 12/12 after the criterion 
was first reached (comparable to the 
analysis of recall data), in spite of the 
recall trial differences, correction and 
non-correction learners did not re- 
spond differentially to rest. 

Questioning of Ss concerning rehearsal
during rest, after they finished
the experiment, revealed no explanation
of the differential effects of interpolated
rest. That is, nearly half the
Ss admitted small amounts of re- 
hearsal or fleeting rehearsal during 
color naming, and a half-dozen ad- 
mitted considerable rehearsal, but 
these Ss were almost exactly equally 
divided between the correction and 
non-correction groups. 

A more significant result of the 
questioning was that, of the 48 Ss 
in the non-correction group, 27 ad- 
mitted that they failed quite often to 
follow instructions, i.e., they corrected 
themselves silently even though they 
did not do so overtly, and another 11 
admitted to occasional self-correction. 

Interpretation.—On the assumption
that the recall differential here reported
is a genuine one (and the two
sub-experiments for the two universities
showed it separately), a difficult
problem of interpretation is raised. 
Since rates of learning, criterion-trial 
scores and reported amounts of re- 
hearsal were quite comparable in the 
correction and non-correction groups, 
there is no objective evidence as to 
why the non-correction group is in- 
ferior at recall. 

Furthermore, the results do not 
align themselves with those of previ- 
ous investigators. For example, 
Ward, who used the non-correction 
technique, found reminiscence, and 
so did Hovland who required cor- 
rection. While it is also true that 
Ward’s Ss spent the rest interval at 
light reading, with possibilities for 
rehearsal, while Hovland’s Ss used 
the color naming technique employed 
here, it does not seem reasonable to 
explain the present results in terms of 
the possibilities of rehearsal in Ward’s 
Ss. 

Another point deserves special men- 
tion. This is the mean loss at recall 
in the rest condition with the non- 
correction group. As suggested by 
its SD, this mean is particularly 
affected by Ss (from both universi- 
ties) whose performance deteriorated 
markedly after color-naming. This 
possible interaction between learning 
technique (non-correcting, anticipat- 
ing) and interval-filling technique 
(naming only) warrants further study. 

To return to the data in general, it 
might be suggested that the habit of 
self-correction is an old, well-estab- 
lished one, and that it is extinguished 
to some degree during learning under 
the non-correction instructions. It 
might then recover sufficiently during 
a rest period to interfere with set on 
the crucial recall trial. But if this 
were the case, one might expect the 
original learning to the criterion to 
have taken somewhat longer under 
non-correction than under correction
instructions. (If the extinction of
the correction set took only a few 
trials, this would not necessarily 
happen.) 

Another possibility is that, since 
S must be able to correct any error 
and also anticipate the next item, 
with the correction technique, a 
slightly higher level of mastery is 
actually demanded to reach the same 
criterion of 7/12. Retention might, 
therefore, be better with the correc- 
tion technique. This suggestion 
would be more convincing, however, 
had there been a larger difference in 
trials to the criterion, in the obtained 
direction. 


SUMMARY 


1. Two groups of 48 Ss learned 
12-item syllable lists by the serial 
anticipation method. One group
learned by the correction method, the
other by non-correction. Within
each group, each S served in a rest
condition (two min. color naming
from the drum, interpolated after a
criterion of 7/12 was reached) and a
no-rest condition.

2. A majority of Ss in both the
correction and non-correction groups
admitted some rehearsal during the
rest interval. More important, half
the Ss in the non-correction group 
admitted that they corrected them- 
selves silently at least part of the time. 
It thus is entirely possible that the 
seemingly different instructions of 
Ward and Hovland on this point did 
not produce any differences in the 
learners. 

3. There was a tendency for reminiscence
to appear at recall in the correction
rest condition, and clearer
evidence of forgetting in the non-correction
rest condition. These two
results taken together lead to a reliable
difference in the measured effects of
the learning technique. The forgetting
in the non-correction condition
was unexpected; since this condition
employed Ward’s learning technique
but Hovland’s rest activity, there are
no data with which to compare the
present results. The possibility of
interaction needs to be investigated.


(Manuscript received July 7, 1948)


TRANSFER TO A MOTOR SKILL FROM PRACTICE
ON A PICTURED REPRESENTATION 1


BY R. M. GAGNE AND HARRIET FOSTER 
Connecticut College 


INTRODUCTION 


It has seemed evident to many in- 
vestigators of the learning of motor 
skills that the difficulty of motor tasks 
is often to a large extent determined 
by the complexity of the stimulus 
situation to which the subject must 
react, rather than by the demands 
made on his motor capacities. In 
many practical situations, the oper- 
ator of a mechanical or electronic 
device must make relatively simple 
directional movements to switches or 
knobs, in response to a stimulus situ- 
ation which varies simultaneously 
along a number of dimensions. In 
the laboratory, too, the kind of task 
set for the subject in a motor learning 
experiment is often one which involves
relatively simple, well-practiced
responses, such as placing cards
in bins, pressing keys, or tracing
paths. At the same time, such tasks
may be made difficult to learn be- 
cause of the number and complexity 
of stimuli to which the reaction must 
be made. Because of these facts, it 
is frequently said that the learning of 
motor tasks is largely a matter of 
learning perceptual relationships; the 
‘perceptual aspect’ of the task is the 
thing which has the greatest effect on 
the learning of the motor skill. 

The problem of the present investi- 
gation has been to determine the 
extent to which positive transfer to


1 This article is Report 316-1-4 under Con- 
tract N7onr-316, Task Order I, between the 
Special Devices Center, Office of Naval Re- 
search, and Connecticut College. Research is 
carried out under this contract at the U. S. 
Naval Medical Research Laboratory, U.S. Naval 
Submarine Base, New London, Connecticut. 


the learning of a motor skill can re- 
sult from varying amounts of practice 
on a paper-and-pencil representation 
of the skill. The paper-and-pencil 
task employed was one which re- 
quired the learning of the same stim- 
ulus-response relationships as those 
of the motor task. The stimuli were 
‘pictures’ of the actual stimuli; the 
responses of marking X’s in printed 
spaces were made in the same relative 
spatial positions as those of the motor 
reaction panel. 

The motor skill on which the sub- 
jects were trained was a relatively 
simple one so far as the responses 
were concerned. The reaction re- 
quired of the subject was that of 
moving his hand in one of four di- 
rections from a starting point near 
his body, and momentarily pressing 
a switch located at each of four posi- 
tions. The learning of this skill 
appears to involve the acquisition of 
discriminative responses to the four 
stimuli, and to be influenced by the 
degree of generalization which exists 
between them (4). The attempt was 
made in the present investigation to 
discover to what extent this learning 
could be facilitated by different 
amounts of practice on the pictured 
task. 

A review of the literature has re- 
vealed no studies in which the prob- 
lem has been stated exactly in these 
terms. Related experiments are 
those in which subjects have learned 
motor tasks in some degree, either 
by direct observation or demonstra- 
tion of the actual task, as in the 
studies of Bray (1), Twitmyer (12), 
and Siipola (11); or by observation of
motion pictures, as in the studies of
Lockhart (6) and May (7). In ad- 
dition to these, various other kinds 
of verbal and graphical guidance were 
found to have considerable effective- 
ness for learning in the experiments of 
Carr (2). Another related kind of 
investigation is exemplified by studies 
of the effect of ‘mental practice’ on 
the learning of a motor skill. Again 
in these cases, the procedure was one 
in which the subjects observed the 
actual task, rather than a representa- 
tion of it, and attempted to carry out 
imagined practice of the required 
responses. The general finding of 
these studies by Sackett (10), Eggles- 
ton (3), Perry (9), and Vandell, 
Davis, and Clugston (13), has been 
that mental practice exhibits con- 
siderable positive transfer to the 
learning of motor skills. 

Practice on the paper-and-pencil 
task which represented the motor 
skill of the present experiment differs 
from the practice situation in studies 
of the effectiveness of observation 
and mental practice in several im- 
portant respects. In the first place, 
the conditions of the mental practice 
experiment do not involve control of 
the actual type and number of re- 
sponses which the subject makes, 
whereas in the present experiment, 
both of these are specified. Sec- 
ondly, the observational situation usu- 
ally presents the subject with a con- 
tinuing view of the apparatus itself, 
whereas in the present study, the 
stimulus situation is only a pictured 
representation of the motor task. 
The pictured stimuli employed re- 
semble the actual stimuli in shape 
and approximately in color, but 
differ from them in a number of re- 
spects, including absolute size and 
spatial position. It is to be noted, 
however, that the relative location of 
the stimuli and the response blanks
in the pictured task is essentially the
same as the relative location of the 
light stimuli and switches in the motor 
task. In both cases, an upper light 
is to be associated with a response to 
the right, and a lower light with a re- 
sponse to the left; a red light is to be 
associated with switch No. 1 (the left 
of a pair), and a green light with 
switch No. 2 (the right of a pair). 
The actual responses of the two tasks 
are markedly different in other re- 
spects. The motor task requires di- 
rectional movements of the arm in an 
arc approximately two feet in front 
of the body; the preliminary task 
involves small hand and finger move- 
ments in the same relative directions, 
but in a limited space. 
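
The stimulus-response relationships common to the two tasks can be written out as a small lookup, which makes plain that position of the stimulus fixes the side of the response and color fixes the number. The function name and the string labels are illustrative choices, not terms used in the experiment.

```python
from itertools import product

SIDE_FOR_POSITION = {"upper": "right", "lower": "left"}
NUMBER_FOR_COLOR = {"red": 1, "green": 2}


def correct_response(position, color):
    """Return (side, number) of the switch or rectangle for one stimulus."""
    return SIDE_FOR_POSITION[position], NUMBER_FOR_COLOR[color]


# List the four stimulus situations and the response each one calls for.
for position, color in product(("upper", "lower"), ("red", "green")):
    side, number = correct_response(position, color)
    print(f"{position} {color} -> No. {number} on the {side}")
```

The same mapping applies whether the response is an X in a printed rectangle or the pressing of a switch; only the form of the movement differs.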

If the stimulus situation of the 
paper-and-pencil task is sufficiently 
similar to that of the motor task, it 
may be expected that the effect of 
increasing the amount of practice on 
this preliminary task will be to reduce 
the tendency to generalize which 
exists between the stimuli of the 
motor task, since this has been shown 
to occur as a result of direct practice 
on a task which includes the stimuli 
of the motor task itself (4). In other 
words, giving preliminary practice on 
the paper-and-pencil task should be 
one way of pre-differentiating the 
stimuli involved in the motor task, so 
that a smaller amount of interference 
would be expected following such 
training. Such reduction of general- 
ization is one factor which we assume 
to be operating in the present experi- 
ment. 

The effect of a second factor is more 
difficult to predict. This is the in- 
fluence of response similarity on the 
degree of transfer to be expected. 
It is evident that despite the fact of 
resemblance between the directions 
of responses in the two situations, 
we seem to be dealing with a somewhat
derived example of the association
of a new response (pressing a
switch instead of making a pencil 
mark) with an old stimulus (position 
and color of circular objects), a 
situation which is well-known for its 
tendency to result in negative trans- 
fer. It is probable, therefore, that 
some tendency for the exhibition of 
negative transfer should be expected 
in our experiment. Previous studies 
which have measured negative trans- 
fer (cf. 8) have indicated it to undergo 
a reduction with increasing amounts 
of practice on the preliminary (inter- 
fering) task. 


METHOD


Subjects—The subjects used in the experi- 
ment were 145 young Navy enlisted men who 
were taking the physical examination for entry 
into Submarine School. These men formed a 
relatively homogeneous sample, since they had 
been selected on the basis of a number of quali- 
fications, including those of having volunteered 
for submarine duty, of being above average in 
intelligence as measured by Navy tests, of meet- 
ing certain minimum physical requirements, and 
of displaying no evidence of emotional instability. 
They were selected at random for use as subjects 
in the experiment. 

The men were divided into five matched 
groups according to two criteria: (1) a learning 
score obtained from the administration of the 
full-page Woodworth-Wells digit-symbol test 
with a time limit of 90 sec.; and (2) a measure 
of reaction time obtained in 25 trials of reaction 
to a single green light of the apparatus by press- 
ing a single switch. The multiple correlation 
between these two measures and the average response
time score during 60 trials in the control
group was .424. The means and standard devi- 
ations of these scores in the five groups are shown 
in Table I. 

It can be seen that the five groups are quite 
adequately matched. The value of mean reac- 
tion time for the control group is slightly lower 
than that of the other groups, but this difference 
is opposite in direction to those obtained be- 
tween groups as a result of the experimental 
conditions. 

Apparatus.—The apparatus for the motor 
task has been described in a previous article (4). 
Briefly, the task requires the subject to react to 
one of four lights by pressing an appropriate 
switch on a reaction panel. The display panel 
contains two sets of red and green lights, one set 
7} in. above the other. The reaction panel con- 
tains a starting button, on which the subject’s 
thumb rests before each reaction, and four spring- 
return toggle switches, two on the left and two 
on the right of the center, each set numbered 1 
and 2, respectively, beginning from the left. 
The experimenter turns on any one of the four 
lights and at the same time automatically starts 
an electric timer. The subject responds by 
pressing momentarily the correct switch as de- 
fined to him by the instructions. The pressing 
of the proper switch turns out the light and stops 
the timer. If an incorrect switch is pressed, the 
light continues to glow and the timer to run until 
the subject makes the correct reaction. The 
experimenter reads aloud the time, records it, 
and resets the timer between each trial. 

Procedure.—At the beginning of the session, 
the subject was given the full-page Woodworth-
Wells digit-symbol test with a time limit of 90
sec. Following this, the subject was given 0, 8,
16, 24, or 48 trials on a paper-and-pencil task, 
a sample page of which is depicted in Fig. 1. 
Each page of the booklet represented a single 
stimulus situation in which one of the four circles 
was colored either red or green, corresponding 
with the color of the panel light which it represented.
The subject was to mark an X in one
of four rectangles at the bottom of the page.
The following instructions were given:

This is a paper-and-pencil test which represents
this apparatus. [Point to motor skill
apparatus.] The circles [point to circles]
represent the lights on the panel [point to
lights]. The circle which is colored indicates
which light is on. The rectangles [point to
rectangles] represent switches [point to
switches] to be pressed. There are four rectangles,
one for each circle. If a lower circle
is colored, you should react by marking an X
in a rectangle on the left. If an upper circle
is colored, you should mark an X in a rectangle
to the right. A red circle means rectangle
No. 1, a green circle means rectangle No. 2.
A lower red circle, then, means rectangle No. 1
on the left; a lower green circle means rectangle
No. 2 on the left; an upper red circle
means rectangle No. 1 on the right; and an
upper green circle means rectangle No. 2 on
the right. Work as rapidly and as accurately
as you can, marking an X in the correct one
of the four rectangles. As soon as you have
marked your X, turn to the next page and
mark an X in the correct rectangle there.
Are there any questions? Ready—go ahead.

The total time required to complete the given
number of trials was recorded, together with the
number of errors. If more than half the responses
made by a subject were errors, the data
were discarded, since this indicated that he had
[illegible in source] his
initial practice. The subjects whose data were
employed for analysis made an average error
score of about 6 per cent on the paper-and-pencil
task.

Fig. 1. Sample page of the paper-and-pencil
task. From the top, the colors appearing
in the circles were red, green, green, red. Top
green is shown, requiring the response to be made
to rectangle 2 on the right.


TABLE I

[Table I: for each of the five groups (0, 8, 16, 24, and 48 trials), the mean and SD of the Substitution Test score (number right) and of Reaction Time (in hundredths of seconds). Only the following entries survive in this copy, and their cells cannot be identified: 63.31, 9.78, 63.24, 42.52, 7:3.]


When the subject had completed the paper-and-pencil
task, he practiced the final motor task
for 60 trials. The instructions were as follows:

This apparatus measures your reaction
time. Using your preferred hand, always
start with your thumb resting on this button
(demonstrate). You are to react in the same
way as on the paper-and-pencil test, but you
will react to a light instead of a colored circle,
and by pressing a switch instead of marking
an X in a rectangle. A lower light means a
switch on the left, an upper light means a
switch on the right. A red light means switch
No. 1, and a green light means switch No. 2.
Return your hand to the starting button after
each reaction. Try to make as fast a reaction
as you can, without moving in the wrong direction.
Your score on each trial can be seen on
this clock, which reads in hundredths of a
second. If you make the wrong reaction, the
clock will continue to go, so you should then
press the correct switch as rapidly as possible.
I will read off your score each time, so that you
can keep track of your own improvement.
All ready to begin? Ready!

After a correct response was made by the subject,
the experimenter read aloud and recorded
the reading of the timer in hundredths of a
second. The number of incorrect switches (if
any) pressed by the subject during the trial was
also recorded. The time interval between trials
was maintained at the approximately constant
value of 10 sec. during the final training of the
groups. The time elapsing between the end of
the paper-and-pencil training and the beginning
of practice on the final task was the length of
time required to seat the subject and to give the
instructions for the final task.

Upon completion of training in the final task,
the reaction time of each subject was measured
in 25 trials of reaction to switch No. 2 on the left,
on the appearance of the lower green light. This
measure of reaction time was used in the matching
of groups.


RESULTS


TABLE II

Time of Response Per Trial (in 1/100 Sec.) in Successive 10-Trial Stages
of Learning the Motor Task. N = 290

[Column heads surviving in this copy: Trials 1-10, 21-30, 41-50, 51-60. Group labels and the alignment of entries with columns are not recoverable. Surviving entries, mean with SD in parentheses, in order of appearance: 112.28 (72.40); 111.96 (69.31); 115.84 (75.05); 108.50 (74.50); 113.56 (79.97); 97.37 (60.04); 90.59 (52.30); 84.37 (51.02); 88.81 (40.19); 85.63 (43.97); 82.96 (49.33); 91.76 (54.59); 82.71 (52.68); 76.23 (42.33); 86.58 (48.50); 81.53 (44.98); 83.68 (40.82); 78.21 (40.31); 85.49 (54.13); 81.95 (46.22).]


Learning and transfer.—The means
and standard deviations of response
times in successive sets of ten trials
of practice on the motor task, for
each of the five groups, are given in
Table II. The values of the means
at the beginning of learning are
closely similar, and no significant
differences appear between the scores
of the various groups. On succeeding
trials, however, relatively large
differences occur between the time 
score values of the various groups, 
indicating differences in rate of learn- 
ing. By the time the stage of final 
learning denoted by trials 21-30 is 
reached, all the experimental groups 
have time score values below that of 
the control group. This continues 
to be the case until the final set of 10 
trials is reached, when the differences 
become less. Additional practice 
would evidently be needed to deter- 
mine whether the curves are approach- 
ing the same or different asymptotic 
values. 

The significance of these differ- 
ences was tested by the use of the 
formula which applies to correlated 
means. Since the scores on each 
trial of the 10-trial sets were em-
ployed in obtaining the SD's, these
values are larger than they would be
if the systematic variance of learning
were removed, and the estimates of
significance obtained are somewhat
conservative. On trials 21-30, dif-
ferences significant at the .01 level
are found between the values for the
control group and Groups 24 and 48;
at the .05 level, between the control
group and Group 8. On trials 31-40
and 41-50, significant differences at
the level of .05 or less occur between
the control group and each experi- 
mental group except Group 8. Sig- 
nificant differences between the scores 
of the various experimental groups 
may be summarized as follows: On 
trials 21-30, between Groups 8 and 
48, 16 and 48; trials 31-40, Groups 8 
and 16, 8 and 24, 8 and 48; trials 
41-50, Groups 8 and 48. 
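As a rough illustration of what a test on correlated means involves, the sketch below uses the textbook formula for the standard error of a difference between two correlated means. The function names, the use of SD divided by the square root of N as the standard error of a mean, and every number in the example are assumptions made for illustration only; the article does not report the correlation produced by matching, nor the exact form of the formula that was applied.

```python
import math


def se_of_mean(sd, n):
    """Standard error of a mean, taken here as SD / sqrt(N) (an assumption)."""
    return sd / math.sqrt(n)


def critical_ratio_correlated(mean_a, sd_a, mean_b, sd_b, n, r):
    """Difference between two correlated means divided by its standard error.

    r is the correlation between the paired (matched) scores; it is a
    hypothetical input, since the article does not report it.
    """
    se_a = se_of_mean(sd_a, n)
    se_b = se_of_mean(sd_b, n)
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2 - 2 * r * se_a * se_b)
    return (mean_a - mean_b) / se_diff


# Hypothetical values: the same difference is judged against a smaller
# standard error when the paired scores are positively correlated.
print(critical_ratio_correlated(10.0, 2.0, 8.0, 2.0, 16, 0.0))
print(critical_ratio_correlated(10.0, 2.0, 8.0, 2.0, 16, 0.5))
```

The comparison of the two printed ratios shows why matching matters: a positive correlation between matched scores shrinks the standard error of the difference, so the same difference in means yields a larger critical ratio.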

In order to display clearly the 
effects of various amounts of paper- 
and-pencil practice on the learning of 
the final task, curves depicting the 
percent of total learning accomplished 
by each group are shown in Fig. 2. 
These curves were obtained by sub- 
stituting the values of Table II in the 
formula: 


$$\text{Percent Transfer} = \frac{\text{C Group Score} - \text{T Group Score}}{\text{C Group Score} - \text{Total Possible Score}}$$





The estimated measure of ‘total learning’ used was the score attained by Group 16 on trials 51-60, or 76.23 hundredths of a sec. The curves of Fig. 2 indicate that although the preliminary training on the paper-and-pencil task appears to accomplish nothing on trials 1-10 of the final motor learning, succeeding practice on the final task reveals the effect of considerable positive transfer. On trials 31-40 of the final learning, for example, the control group has been able to bring the task to slightly over 40 percent of (estimated) mastery. The percent of total learning accomplished by each of the experimental groups is greater than this, being at least 50 percent in the group which has had 8 trials of paper-and-pencil practice, and over 80 percent in the groups which have had 16, 24, and 48 trials of such practice.

Fig. 2. Percent learning accomplished at six stages of practice on the motor task by each of five groups with different numbers of preliminary paper-and-pencil training trials.
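The percent-learning measure can be sketched in a few lines. The sketch assumes that the ratio is multiplied by 100 to give a percentage, that the reference score is the control group's mean on trials 1-10 (112.28), and that 97.37 is the control group's mean on trials 31-40; the last reading is an inference from a damaged table and may not be correct. The function name is hypothetical.

```python
def percent_of_learning(reference_score, observed_score, total_learning_score=76.23):
    """Percent of the distance from a reference score to the estimated
    'total learning' score (76.23 hundredths of a sec.) that has been covered.

    Scores are response times, so lower values mean more learning.
    """
    return 100.0 * (reference_score - observed_score) / (
        reference_score - total_learning_score
    )


# Assumed readings: control group, trials 1-10 and trials 31-40.
print(round(percent_of_learning(112.28, 97.37), 1))
```

Under these assumptions the control group comes out a little above 40 percent on trials 31-40, which agrees with the statement in the text.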

The results shown in Fig. 2 also 
indicate the decreasing amount of 
transfer effect contributed by suc- 
cessive additional units of prelimin- 
ary practice on the paper-and-pencil 
task. Throughout the course of final 
learning, the greatest differences be- 
tween the scores of the experimental 
groups occur in general between 
Groups 8 and 16. The differences
between the learning of Group 16 and
those groups which had 24 and 48
trials of preliminary practice are
particularly evident in the early
stages of the learning, but shrink to
insignificance after the 30th trial.
Thus, although 16 trials of prelimin- 
ary training contributed more than 
did 8 trials of such training to the 
learning of the motor task, additional 
amounts of paper-and-pencil training 
(24 and 48 trials) did not bring about 
very marked additional improvement 
in the final learning. 

Errors.—The means and standard
deviations of the number of errors 
occurring during successive 10-trial 
intervals of the final learning in each 
of the five matched groups of sub- 
jects are given in Table III. The 
final column presents the sum of 
these average values, or the average 
number of errors occurring throughout 
60 final learning trials. Many of the 
differences in error scores for the 
various groups are significant at the 
.05 level or less. This is true for a
comparison of the scores of the con- 
trol group with the 24-trial group 
throughout learning; with the 48- 
trial group, on all trials except 51-60; 
with the 16-trial group, on trials 21-40, 
51-60; and with the 8-trial group, on 
trials 21-30 and 41-50. Significant
differences among the scores of the 
experimental groups are found be- 
tween Groups 8 and 16 on trials 31- 
40; Groups 8 and 24 on trials 11-20, 
31-40; Groups 16 and 48 on trials 
11-20. 

A graph of the relation between the total error values and the number of trials of preliminary paper-and-pencil practice is shown in Fig. 3. It is evident from this figure that eight trials of predifferentiating practice bring about a sharp reduction in errors, in comparison with the number occurring in the control group. Further reductions of decreasing magnitude result from increasing the amount of preliminary practice on the paper-and-pencil task. These results do not reveal any tendency for generalization, as measured by errors on the final task, to rise to a maximum and then decrease as the amount of predifferentiating practice is increased, as E. J. Gibson’s (5) analysis of paired associate learning would suggest. The reason for this may be that the amount of predifferentiation accomplished by eight trials was sufficient in this situation to permit the point of maximum generalization to be passed during the 8-trial practice period itself. Testing this hypothesis would obviously involve measuring the effects on learning of an even smaller amount of paper-and-pencil practice. However, such an interpretation evidently does not conflict with the findings of a previous experiment (4) in which errors were shown to increase to a maximum and then decrease as a result of increasing the amount of predifferentiating practice. In that study, the preliminary training consisted of practice on the same motor skill apparatus, with the use of only two of the four stimuli; the subject practiced only one of the two types of discrimination required in the final task. Such practice was found to have a positive transfer effect on the learning of the total skill, provided sufficient amounts of it were given. Whereas 10 trials of preliminary part-practice resulted in no transfer and an increase in the number of errors occurring on the final task, 30 and 50 preliminary trials were found to yield positive transfer and a progressive reduction in errors. It is evident that one of the major differences between such predifferentiation and that of the present experiment was the fact that in the present case the total task, including all four stimuli, was represented in preliminary training. The results imply that a training situation which represents the whole task, even in pictured form, has an even greater effectiveness than the same amount of training on a part of the actual motor task.

TABLE III
Errors in Successive 10-Trial Stages of Learning the Motor Task, and Total Errors in 60 Trials. N = 29

31-40
Total Errors

Mean
SD

Mean
SD 2.6

Mean
SD 2.3

24

Mean
SD

1.6

Mean
SD

2.5

3.21

2.62

2.48

27.55
16.3

20.11
12.7

19.59
9.7
16.52
7.7

16.44
10.4

Fig. 3. Total errors made during the learning of the motor task, and number of trials to reach 70 percent learning, both as a function of amount of preliminary practice on the paper-and-pencil task.

Fig. 3 also provides a comparison 
between the reduction in errors re- 
sulting from increasing amounts of 
preliminary practice on the paper-and- 
pencil task, and the rate of learning 
the final task as it is affected by this 
same kind of practice. The measure
of rate of learning employed is num-
ber of trials to reach the criterion of
70 percent learning, and has been
arrived at by taking values from an
abscissa drawn at this point through
the graphs shown in Fig. 2. This
particular criterion was chosen simply 
because it seemed to give values 
which were typical of the learning 
exhibited in these curves. If smooth 
curves were to be drawn through the 
points, an essentially similar rela- 
tionship would be obtained. The 
comparison shown in Fig. 3 indicates 
the close similarity between the in- 
crease in rate of learning and the re- 
duction in number of total errors 
occurring during the final learning. 
These results imply that the effect of 
continued preliminary practice on the 
rate of learning of the final task comes 
about largely because of the reduction 
in generalization tendencies which 
results from such practice. More- 
over, this effect has resulted even 
when the responses required of the 
subject in the preliminary training 
were somewhat different from those of 
the final learning. 
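Reading a trials-to-criterion value off a learning curve made of straight segments amounts to linear interpolation. The sketch below shows one way this could be done; the function name, the choice to place each 10-trial block at its midpoint, and the sample percentages are all assumptions for illustration and are not values from the article.

```python
def trials_to_criterion(trial_midpoints, percent_learning, criterion=70.0):
    """Trial at which a piecewise-linear learning curve first reaches the
    criterion, found by linear interpolation between plotted points.

    Returns None if the curve never reaches the criterion.
    """
    if percent_learning[0] >= criterion:
        return float(trial_midpoints[0])
    points = list(zip(trial_midpoints, percent_learning))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            # Fraction of the segment needed to climb from y0 to the criterion.
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical curves: six 10-trial blocks plotted at their midpoints.
midpoints = [5, 15, 25, 35, 45, 55]
slow_group = [0.0, 20.0, 50.0, 80.0, 90.0, 100.0]
fast_group = [10.0, 45.0, 75.0, 85.0, 90.0, 95.0]
print(trials_to_criterion(midpoints, slow_group))
print(trials_to_criterion(midpoints, fast_group))
```

A group whose curve rises faster crosses the 70 percent line at an earlier trial, which is the sense in which this measure expresses rate of learning.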

The tendency for errors on the final task to pass through a maximum and subsequently decrease as practice continues is shown in Fig. 4, in which are plotted the average number of errors in successive sets of 10 trials of final learning, in each of the five groups of the experiment. This figure shows the tendency for errors to remain relatively high in the control group until after the 40th trial of final learning. A similar tendency may be seen in the errors of the group which had eight trials of paper-and-pencil practice, though the total number of errors is considerably lower than that occurring in the control group. In those groups having 16 and 24 trials of predifferentiating practice, this point of maximum error occurrence appears nearer the beginning of final learning, and is passed after the 20th trial. In the group having 48 predifferentiating trials, the maximum number of errors apparently occurs during the interval of trials 1-10. Thus, although the regularity of this tendency is by no means complete, it would seem that the effect of preliminary paper-and-pencil training has been in general to push the point of maximum error occurrence nearer and nearer the beginning of final learning as the amount of predifferentiation is increased. In a previous paper (4), the hypothesis was suggested that this effect resulted from a reduction in generalization between the stimulus members of the total task as a result of preliminary training.

Fig. 4. Number of errors made at six stages of practice on the motor task, by each of the five groups, with different amounts of preliminary paper-and-pencil training.

Correlations between preliminary and
final practice.—Coefficients of correla-
tion were obtained between the times
required for completing the paper-and-
pencil task and the average response
time for 60 trials of practice on the
motor task. For the 8-trial group,
the data are N = 35, r = .13 ± .11;
for the 16-trial group, N = 36, r =
.12 ± .11; for the 24-trial group, N
= 35, r = .67 ± .06; and for the 48-
trial group, N = 33, r = .47 ± .09.
It will be noted that, at least in the 
groups having the larger numbers of 
trials on the paper-and-pencil task, 
these coefficients indicate a consider- 
able degree of relationship between 
this task and the motor skill. This 
amounts to an indication that the 
perceptual ability measured by the 
score on the paper-and-pencil task 
has a good deal in common with the 
kind of ability displayed in learning 
to perform the motor task. 
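The quantity reported after each coefficient appears to be its probable error. This is an inference rather than something the article states: the classical formula PE = 0.6745(1 - r²)/√N reproduces all four reported values to two decimals, as the sketch below checks. The function name is hypothetical.

```python
import math


def probable_error_of_r(r, n):
    """Classical probable error of a correlation coefficient:
    0.6745 * (1 - r**2) / sqrt(N)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# (group, N, r) as reported for the four experimental groups.
reported = [("8-trial", 35, 0.13), ("16-trial", 36, 0.12),
            ("24-trial", 35, 0.67), ("48-trial", 33, 0.47)]

for group, n, r in reported:
    print(group, round(probable_error_of_r(r, n), 2))
```

On this reading, the coefficients for the 24-trial and 48-trial groups are many times their probable errors, while those for the 8-trial and 16-trial groups are about equal to theirs, which fits the remark that the relationship is evident mainly in the groups with more paper-and-pencil trials.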


DISCUSSION


The results of the present experi-
ment indicate that a considerable
degree of positive transfer results from 
practice on a paper-and-pencil repre- 
sentation of a motor skill. The 
amount of such transfer has been 
found to increase as the number of 
trials of practice on the paper-and- 
pencil task is increased from 8 to 48. 
The largest degree of increase seems 
to occur, in this situation, with 
amounts of preliminary practice up
to 16 trials; thereafter, the increase
in amount of transfer is not great. 
Thus, the effectiveness of continued 
practice on a ‘pictured’ task may be 
seen to have a limit of practical 
usefulness. 

One important point which is made 
by these results concerns the way in 
which transfer makes itself apparent 
during the course of learning the final 
motor task. On the first few trials 
of this learning, no evidence of 
transfer is found. As practice on the 
final task is continued, however, sig- 
nificant differences in the rate of 
learning are found between the groups 
having different amounts of prelimi- 
nary practice on the paper-and-pencil 
task. These results emphasize the 
necessity for measuring the learning 
of the final task throughout a rea- 
sonably large number of practice 
trials, in order to obtain an estimate 
of the transfer effectiveness of pre- 
liminary learning. Had we, in the 
present experiment, limited our meas- 
urement of transfer to the first 10 
trials of learning of the final task, the 
results would have indicated no 
transfer from any of the conditions of 
preliminary practice which were used. 
Instead, the results indicate the pres- 
ence of both interfering and facilitat- 
ing effects, as is undoubtedly the case 
in a great many experiments on 
transfer. It appears that the inter- 
fering effects of the preliminary prac- 
tice were sufficient in this case to 
obscure the facilitating effects on the 
first few trials of practice on the final 
task. However, when practice was 
continued to later stages, the rela- 
tively large degrees of facilitation 
ultimately revealed themselves, at 
least from the 20th trial on.

The results seem to indicate that 
the increased rate of learning resulting 
from increased amounts of practice on 
the paper-and-pencil task comes about
largely because of the reduction in the
generalization tendencies of the stim- 
uli, as measured by number of overt 
errors. It will be recalled that in the 
present task the time scores which 
are used to measure the learning are 
inevitably increased in value when 
errors occur; consequently, the learn- 
ing scores are not independent of the 
occurrence of errors. However, the 
number of such errors is altogether 
insufficient to account directly for the 
differences in the average response 
time values. Rather, it is proposed 
that the time values reflect the sub- 
ject’s confusions and hesitations which 
are themselves to be considered evi-
dences of interference. The re-
duction in number of errors following
increased amounts of practice on the
paper-and-pencil task is closely cor-
related with the progressive reduc-
tion in number of trials needed to
reach a criterion of learning. This
implies that the chief factor responsi-
ble for an increased rate of learning is
the reduction in the internal gen- 
eralization of the task. 

The results of a previous study (4) 
and an analysis of paired-associate 
learning (5) suggest that positive 
transfer in the present situation is the 
result of such a reduction in general- 
ization, brought about by prelimi- 
nary practice which serves the function 
of predifferentiation. The applica- 
tion of this hypothesis to the present 
situation, however, obviously involves 
a number of additional assumptions. 
In the first place, the stimuli of the 
paper-and-pencil task were not identi- 
cal with those of the final task, but 
only similar in the sense that they 
were a ‘pictured’ representation. 
Nevertheless, the results suggest that 
practice involving these stimuli ac- 
complished a predifferentiation of the 
same general type of stimuli employed 
in the final motor task. In the sec-
ond place, although the responses of
the preliminary and final tasks were 
different in the sense that one re- 
quired marking an X in a printed 
rectangle, whereas another required 
the pushing of a switch, they also 
resembled each other in at least one 
respect—that of relative spatial posi- 
tion. Both in the paper-and-pencil 
task and in the motor task, reactions 
had to be made to positions 1 and 2 
on the left of center and positions 1 
and 2 on the right of center. The 
degree to which this ‘relational’ type 
of similarity was influential in effect- 
ing the positive transfer which was 
found cannot be determined from the 
present results. 

It seems likely that the interfer- 
ence found on the first few trials of 
final task learning resulted chiefly 
from the fact of the difference between 
the responses practiced in the pre- 
liminary training and the responses 
required in performing the final motor 
task. Negative transfer is what 
would be expected in this situation on 
the basis of previous findings which 
have been summarized in the state- 
ment that interference results when 
the learner has to associate a new 
response with an old stimulus. An 
exact determination of the negative 
and positive transfer effects of vari- 
ation in the stimulus and response 
aspects of the task would require two 
additional experiments designed some- 
what differently from the present one. 
One would need to have a preliminary 
task which involved the presentation 
of the identical stimuli used in the 
final motor task, but which required 
responses which were dissimilar in 
all respects to those of the final task. 
A second experiment would employ 
exactly the same responses in the 
two tasks, but the stimuli of the pre- 
liminary task would be only similar 
to those of the final task. It is our
guess that the first of these experi-
ments would yield results not very
different from those which we have
obtained in the present study, namely,
learning curves which show evidence
of interference during the initial
portion and positive transfer as learn- 
ing is continued. On the other hand, 
our hypothesis with regard to stim- 
ulus generalization would lead us to 
expect positive transfer from the sec- 
ond experiment, increasing in amount 
as the number of trials of preliminary 
practice were increased. 

In a previous study (4), we were 
interested in investigating the effect 
of direct practice on a part of the same 
motor task which was used in final 
training. Subjects were given differ- 
ent amounts of preliminary training 
on two of the four switches, requiring 
them to practice one of the two dis- 
criminations involved in the total 
task. When a sufficient amount of 
such training was given, it was found 
to transfer positively to the learning 
of the total task. In the present 
situation, we have given subjects 
practice on a pictured representation 
of the total task. It is evident that 
this kind of training has been in 
general more effective than the part 
training on the actual task used in the 
previous experiment. The present 
results show that 16 trials of practice 
on the paper-and-pencil representa- 
tion of the task accomplished ap- 
proximately the amount of transfer 
effected by 30 to 50 trials of part- 
training on the actual motor task, if 
we compare the level of performance 
reached by the 30th trial of practice 
on the final task. (A more exact 
quantitative statement cannot be 
made because the groups in the two 
experiments were only approximately 
matched with each other.) This in- 
dicates the relatively high degree of 
effectiveness of training on a pictured
representation of the whole task. In
considering the generality of these
results, one important factor has to
be borne in mind. This concerns the
complexity of the motor task used in
this experiment as influenced by the 
number of differential responses re- 
quired of the subject. The differen- 
tial effectiveness of part training as 
compared with whole training on the 
pictured task may be largely de- 
pendent upon the fact that the present 
task was relatively simple in the sense 
of requiring only two different dis- 
criminations. It is quite conceivable 
that the relative effectiveness of 
training on a pictured task would de- 
crease markedly if the number of dis- 
criminative responses of the motor 
task were to be increased. For ex- 
ample, if a motor task requiring 12 
different reactions to 12 switches or 
knobs were to be employed, it seems 
quite possible that training on a paper- 
and-pencil representation of such a 
task might be found to be considerably 
less effective. At the same time, 
preliminary training which involved 
practice on a part of the total num- 
ber of discriminations required in 
such a task might very well be rela- 
tively more effective than such train- 
ing appears to be in the case of the 
present skill. 

That the results indicate positive 
transfer from training on a pictured 
task is consistent with the findings of 
studies which show that a good deal is 
learned by observation of a motor task 
by the subject. It may be pointed 
out that in the present instance, a 
distinctly active sort of observation 
was employed in the preliminary 
practice. Taken as a whole, the 
results lend support to the contention 
that the learning of perceptual rela- 
tionships is of great significance in the 
learning of many types of motor 
skills.


SUMMARY


An experiment is reported which 
involves the measurement of transfer 
of training to a motor task following 
varying amounts of preliminary prac- 
tice on a paper-and-pencil representa- 
tion of the task. Five matched 
groups of subjects, each containing 
30 Navy enlisted men, were used in 
the experiment. Each group learned 
a motor skill which required four dif- 
ferent manual responses to four panel 
lights, differing in color and position. 
A control group had no preliminary 
practice; the other groups were given 
8, 16, 24, and 48 trials, respectively, 
on the paper-and-pencil task. The 
motor task, involving differential re- 
actions to two sets of switches num- 
bered 1 and 2, located on the left and 
on the right of the center line of a 
reaction panel, was practiced in all 
groups throughout 60 trials. The
learning of the motor task was meas-
ured in terms of time required for
each correct reaction and in terms of
number of errors. 

1. No significant differences be- 
tween the response time scores of the 
different groups were found on the 
first 10 trials of practice on the motor 
task. As the learning continued, 
however, significant differences be- 
tween these learning scores appeared. 
In general, the amount of transfer 
measured on the 30th trial of final
learning increased directly with the 
number of trials of preliminary prac- 
tice. The amount of increase was 
greatest up to 16 trials of paper-and- 
pencil practice and less thereafter. 

2. As the number of trials of pre- 
liminary practice on the paper-and- 
pencil task was increased, a progres- 
sive reduction in total number of errors 
occurring during the final learning was 
evident. As in the case of response 
time scores, the amount of decrease in 
errors was greatest up to 16 trials of
preliminary practice and less there-
after. The close correspondence be-
tween reduction in errors and reduc-
tion in time scores consequent upon
varying degrees of preliminary prac- 
tice suggests that the positive trans- 
fer of the paper-and-pencil practice 
was mediated largely through a reduc- 
tion in stimulus generalization. 

3. The curves of errors obtained in 
each group throughout 60 trials of 
practice on the final task exhibit 
differences in the points at which 
maximum numbers of errors occur. 
In the control group and in the group 
having 8 trials of preliminary prac- 
tice, the maximum point seems to be 
passed after the 40th trial. In the
groups having 16 and 24 trials of
preliminary practice, the maxima are
passed by the 20th trial; and in the
group having 48 trials of preliminary
practice, after the 10th trial. Thus,
some evidence is found of a tendency 
for the point of maximum error oc- 
currence to shift progressively nearer 
to the beginning of final learning with 
increasing numbers of trials of prac- 
tice on the preliminary task. 

4. The results indicate the con- 
siderable effectiveness of preliminary 
training on a pictured representation 
of the total motor task, an effective- 
ness which is apparently greater than 
that obtained in a previous study in 
which preliminary training on a part 
of the actual motor task was given. 
In addition, they are consistent with 
results of other studies which have 
indicated the effectiveness of a variety 
of types of observation for transfer to 
the learning of a motor skill. 

5. The results are discussed in re- 
lation to a hypothesis which con- 
ceives the effect of preliminary train- 
ing on the paper-and-pencil task to 
be one of predifferentiating the stimuli 
involved in the final task, by bringing 
about a reduction in tendency to
generalize which exists between these
stimuli. It is suggested that the 
interference effects evidenced on the 
first few trials of final learning may 
be related to the fact of the difference 
between the responses required in the 
preliminary and final tasks. 


(Manuscript received June 14, 1948)


SPREAD OF EFFECT IS THE SPURIOUS RESULT OF
NON-RANDOM RESPONSE TENDENCIES ¹


BY MONCRIEFF H. SMITH, JR. 
Harvard University 


Since Thorndike first announced 
the discovery of the phenomenon he 
called ‘spread of effect’ (8), many in- 
dependent investigations have con- 
firmed it and established firmly the 
existence of the phenomenon. How- 
ever, the acceptance of Thorndike’s 
interpretation of it has never been 
complete, and several recent experi- 
ments have suggested that a reinter- 
pretation may be necessary. 


In a typical spread of effect experiment 
a long list of words (usually 30 or more) 
is presented, one at a time, to the S. 
The S responds to each word as it is 
presented by guessing a number from one 
through ten, and is told ‘Right’ or 
‘Wrong’ by the E. The S’s task is to
guess, and learn, the correct number for 
each word in the list. Analysis of the 
responses of a group of Ss usually shows 
that responses which have been called 
correct are repeated (in response to the 
same word on the following trial), and 
that responses immediately adjacent to 
a rewarded (correct) response are re- 
peated more often than those more re- 
mote from a rewarded response. It is 
this gradient of frequency of repetition 
of erroneous responses surrounding a 
correct response that Thorndike called 
‘spread of effect.’ He interpreted this
phenomenon as evidence that the rein- 
forcing state of affairs created by E’s an- 
nouncement of ‘Right’ acted not only on 
the word-number pair for which it was 
intended, but ‘spread’ or ‘irradiated’ to 
temporally (or serially) adjacent word-
number pairs. Thus, the single an-
nouncement of ‘Right’ had a tendency to
fixate nearby erroneous responses and
create the gradients of erroneous repeti-
tions preceding and following the correct
response.

¹ This paper is based on a dissertation sub-
mitted in partial fulfillment of the requirements
for the Ph.D. at Stanford University. The
author wishes to express gratitude to Drs.
Hilgard, McNemar, and Taylor.

However, Zirkle (10) has recently 
shown that the gradients cannot be 
attributed to a strengthening of the con- 
nection between the stimulus word and 
the number response. In one of a 
series of experiments on the phenom- 
enon, Zirkle altered, from trial to trial, 
the order of presentation of his stimulus 
list. When the data were analyzed in 
terms of word-number connections, no 
gradients of repetitions appeared. How- 
ever, when the response series of suc- 
cessive trials were compared, without 
regard to the stimulus word to which the 
response was made, the usual gradient 
appeared. 

In addition, Jenkins and Sheffield (2) 
have reported that the gradients appear 
only if the correct response is repeated. 
In instances in which S failed to repeat
a response that had been called correct, 
there was no observable trend in the fre- 
quency of repetition of nearby incorrect 
responses. This finding has been con- 
firmed by Taylor (6). 


These facts are difficult to interpret
by the Thorndikian principle of
spread of effect, and point to the re- 
sponse series, rather than to stimulus 
response relationships, as the source 
of the phenomenon, as Jenkins and 
Sheffield have suggested. It is the 
purpose of this paper to extend the 
response series hypothesis, and to pro-
vide experimental tests of its impor-
tance in the spread of effect experi-
ment.


THE PROBABILITY BIAS
HYPOTHESIS


Thorndike’s interpretation of the 
spread of effect experiment rests on 
the basic assumption that the prob- 
ability of the repetition of an incor- 
rect response varies only with learn- 
ing factors. That is, the probability 
of a chance repetition is assumed to 
be independent of its position with 
respect to a rewarded word. This 
condition would clearly be met if 
successive responses were independ- 
ently determined, but the non-ran- 
domness of laboratory Ss in listing 
supposedly random series of numbers 
has long been recognized. Moreover, 
Jenkins and Sheffield have pointed out 
that the repetition of a correct re- 
sponse provides an anchoring which 
aligns the response series on two suc- 
cessive presentations of the stimulus 
list. 

It follows that any consistent bias in 
the response series might act to 
alter the probability of ‘chance’ 
repetitions adjacent to the repeated 
correct response. One such consis- 
tent bias is the tendency of Ss to 
avoid sequential repetition of a re- 
sponse. Thorndike (7) provided ex- 
perimental verification of this effect 
in 1927, and Jenkins and Sheffield 
have reported its occurrence in the 
spread of effect situation. The effect 
of the bias is in the direction of creat- 
ing the spread of effect phenomenon. 
If, for example, on the first trial an S 
guesses ‘five’ in response to the word 
‘chosen,’ the bias makes it very im- 
probable that the response to the 
words preceding and following ‘chosen’ 
will be that same number. If the 
response to ‘chosen’ is called correct, 
the S will tend to repeat it on the 
following trial and to choose, again, 
some number other than ‘five’ for his 
response to the next word. Thus, if
the range of responses is from ‘one’
through ‘ten,’ the choice of a response
following a correct response is not
made from the full range, but from
nine numbers, and the probability of
the same number being chosen on
both trials (by chance) is not one-tenth,
but one-ninth.

If the assumptions are made more 
rigorous, the probability sequence may 
be computed. Consider two urns, 
each containing a series of balls num- 
bered from 1 to n. A ball is drawn 
randomly and simultaneously from 
each. Let the number on one ball 
represent the numerical response to a 
particular stimulus word on the first 
presentation of the list, and the num- 
ber on the other ball stand for the 
response to the same stimulus word 
on the second presentation of the list. 
If continued simultaneous draws are 
made, and the balls replaced after 
each draw, the probability of drawing 
a pair is 1/n. However, if two balls
are drawn and not replaced until
after the next two are drawn (representing
the bias against immediate,
or sequential, repetition), the probability
that any two balls drawn simultaneously
will constitute a pair becomes
more complex. If we assume
no knowledge of the numbers on the
two balls withheld, the probability of
drawing a pair is still 1/n. If, however,
we observe that the numbers on
the balls that have been withheld
constitute a pair (representing repetition
of the correct response), the
probability of drawing another pair
is 1/(n − 1). The probability of a
pair on the next draw must also be
calculated in the light of the probabilities
following the paired draw, and
so on. The sequence of probabilities
in the spread of effect situation, as- 
suming repetition of the correct re- 
sponse and complete randomness of 
sequence except for the bar against 
immediate repetition, runs as follows: 
Position               Probability of a Trial-to-Trial Repetition

Rewarded Response      $1$

$r_1$                  $\dfrac{1}{n-1}$

$r_2$                  $\dfrac{n^2 - 3n + 3}{(n-1)^3}$

$r_3$                  $\dfrac{(n^2 - 3n + 3) + (n-1)^3 (n-2)}{(n-1)^5}$

$r_i$                  $\dfrac{\mathrm{Prob.}(r_{i-1}) + n - 2}{(n-1)^2}$

It may be seen that this series converges
rapidly to a close approximation
of 1/n and that the rate of convergence
is a function of n. Thus,
for 10 categories of response, the
series runs 1.0, .111, .1001, etc., and
for four categories of response, 1.0,
.333, .259, .251, etc.
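The series quoted above can be checked numerically. The sketch below is an illustration, not the author's own computation. It assumes that each of the two response series is random except for the bar against immediate repetition. Under that assumption, if the preceding responses on the two trials matched, the next pair matches with probability 1/(n − 1); if they differed, the next pair matches with probability (n − 2)/(n − 1)², since only n − 2 numbers remain available on both trials. The function name `repetition_probabilities` is hypothetical.

```python
from fractions import Fraction


def repetition_probabilities(n, steps):
    """Chance probability of a trial-to-trial repetition at the rewarded
    position (index 0) and at each of the `steps` positions following it.

    Assumption (for illustration): both series are random apart from a
    complete bar against immediate repetition of a response.
    """
    probs = [Fraction(1)]  # the rewarded response is assumed to be repeated
    for _ in range(steps):
        prev = probs[-1]
        matched = prev * Fraction(1, n - 1)                      # predecessors were a pair
        unmatched = (1 - prev) * Fraction(n - 2, (n - 1) ** 2)   # predecessors differed
        probs.append(matched + unmatched)
    return probs


for n in (10, 4):
    series = repetition_probabilities(n, 3)
    print(n, [round(float(p), 4) for p in series])
```

With exact fractions, the second position after reward comes out as 73/729 for ten response categories and 7/27 for four, which agree with the rounded values .1001 and .259 given in the text.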

The above reasoning has been ap- 
plied only to the gradient following 
reward, but it may be extended in 
part to the fore-gradient. If the 
repetition of the correct response 
could be considered as a pair that 
just happened to come up in a series 
of random draws, the application 
would be complete. However, the 
repetition of the correct response has 
more the aspect of an interruption of
the sequence, which sets it as a new
starting point. Thus, only if the S 
recognizes in advance that the correct 
response is coming up, and keeps in 
mind the number to be produced at 
that point, will this hypothesis apply 
to the fore-gradient. Even then it 
would probably apply only to the 
position immediately preceding re- 
ward. 

The probability bias hypothesis 
may be stated as follows: The known 
tendency of Ss to avoid sequential rep- 
etition in a supposedly random series 
of responses operates to bias the prob- 
ability of trial-to-trial repetitions of 
erroneous responses adjacent to a correct
response. The direction and extent
of the bias are such as to produce 
gradients of erroneous repetitions re- 
sembling the spread of effect phenom- 
enon. 

It is recognized that this statement 
is incomplete. There are other 
sources of bias in the series of prob- 
abilities of trial-to-trial repetitions, 
some that can be specified now and 
some that cannot. A broader state- 
ment of the hypothesis would em- 
brace the possibility that the entire 
spread of effect phenomenon could be 
deduced from (a) the observed fre- 
quency of trial-to-trial repetition of 
the correct response in the spread of 
effect experiment, and (b) the devi- 
ations from randomness in response 
series listed by Ss. 


THE RELATIONSHIP BETWEEN THE
PROBABILITY BIAS HYPOTHESIS
AND THE GUESSING SEQUENCE
HYPOTHESIS


The guessing sequence hypothesis 
proposed by Jenkins and Sheffield 
(2) assumes that Ss tend to guess 
sequences of numbers, so that for a 
particular S ‘three’ might very often
be followed by ‘four.’ Any such
tendency, together with the ‘anchor- 
ing’ of the response series produced by 
the repetition of the correct response, 
should produce a gradient of errone- 
ous repetitions following the correct 
response. 

Although the guessing sequence 
hypothesis depends primarily on in- 
dividual response sequences, Jenkins 
and Sheffield noted in their data two 
general response tendencies. One 
was the tendency to avoid immediate 
sequential repetition of a response, 
and the other a tendency for Ss to 
choose successive responses from the 
same neighborhood of the cardinal 
scale. These general tendencies may
be considered as indirect evidence for 
individual guessing sequences, or they 
may be looked upon as evidence of the 
failure of the response series to coin- 
cide perfectly with the formal model 
of the random series. 

The probability bias hypothesis 
takes the latter view. The sources 
of group tendencies might include 
individual preferences for fixed series, 
but this seems to be another problem. 
If the spread of effect phenomenon 
can be deduced from the study of 
group biases in response series, then 
no further assumptions need be made 
in dealing with spread of effect. The 
facts correlated with the failure of 
the group to respond in a random 
fashion can probably best be studied 
in some other experimental situation. 


OTHER NON-RANDOM ASPECTS
OF RESPONSE SERIES


The second group tendency noted 
by Jenkins and Sheffield, the tend- 
ency for Ss to choose successive re- 
sponses from the same neighborhood 
of the number scale, has not been 
included in the hypothesis offered
here. However, it is obvious that
this tendency acts to restrict further 
the range of responses to the words 
preceding and following a repeated 
correct response. This restriction in- 
creases the slope of the gradients of 
‘chance’ repetitions surrounding a 
correct response. 

There is still another known tend- 
ency that might influence spread of 
effect data. Ss tend to run through 
the whole sequence of available num- 
ber responses before repeating one. 
That is, they avoid not only immedi- 
ate sequential repetition, but also 
repetition several steps removed. If 
this tendency were complete, it would 
make the response series invariable, 
and would remove the bias in trial-to-trial
repetitions introduced by failure
to repeat immediately. However, the 
tendency is only partial, and has the 
effect of making the trial-to-trial rep- 
etition gradient less steep. One way 
of demonstrating this deviation from 
randomness is by a plot of the dis- 
tribution of the number of steps from 
the occurrence to recurrence of a 
particular number. Since this dis- 
tribution plays a part in one of the 
experiments to be described, a brief 
description of it is given here. 

By one definition of randomness, 
each element of a series has an equal 
chance of appearing in any position in 
the series. If we consider a series
composed of digits from ‘one’ through
‘six,’ and examine the sequence following
a position in which ‘one’ appears,
‘one’ has an equal probability (1/6) of
appearing in any of the following
positions. If, however, we consider
the probability of ‘one’ first recurring
after a fixed number of steps it is no
longer 1/6. The specification of first
recurrence indicates that we need the
joint probability of the recurrence of
‘one’ on, and not before, that step.
Thus, the probability of an immediate
recurrence is 1/6. The probability of a
first recurrence one removed (two
steps) is 5/6 × 1/6, or the probability of
a ‘non-one’ and then a ‘one.’ For
the third step the figure is 5/6 × 5/6 × 1/6,
etc. The resultant series is a monotonically
decreasing function, asymptotic
to zero. If the series is limited
to 30 numbers, each considered as an 
equally likely starting point, a slight 
correction is needed. The correction 
for the finite length of the list consists 
of multiplying the probability of a 
recurrence r steps from the starting 
point by (30-r)/30. The corrected 
plot is shown by the solid line in Fig. 
1. It should be noted that the sum 
of the probabilities is not one, since a 
number cannot recur after its last
appearance in the series of 30. 
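The corrected random-series curve described above can be reproduced in a few lines. The following sketch assumes six equally likely digits and a list of 30 positions, as in the text. The function name `first_recurrence_probs` and its parameter names are illustrative only.

```python
def first_recurrence_probs(categories=6, length=30):
    """Probability that a number first recurs exactly r steps after a
    randomly chosen starting position in a random list of `length` items.

    The factor (length - r) / length is the finite-list correction:
    a recurrence r steps ahead is impossible from the last r positions.
    """
    p = 1.0 / categories
    return {
        r: (1.0 - p) ** (r - 1) * p * (length - r) / length
        for r in range(1, length)
    }


probs = first_recurrence_probs()
for r in (1, 2, 3):
    print(r, round(probs[r], 4))
print("sum over all steps:", round(sum(probs.values()), 4))
```

The values fall steadily with r, and their total stays below one, as the text notes.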

On the other hand, if a series of 30 
terms given by a laboratory S is 
analyzed in terms of the number of 
steps till the first recurrence of a given 
number, the plot looks quite different. 
The dashed line in Fig. 1 represents 
such a plot. The data were taken 
from the numbers listed by Ss on the 
first presentation of a stimulus list in 
a spread of effect experiment. The 
curve is an average of 15 Ss, and each 
S gave seven lists of 30 units (digits 
from ‘one’ through ‘six’). In making 
up the curve for each S a count was 
made of the number of steps from the 
occurrence of each number to its first 
recurrence in the same list. The re- 
sulting frequency distribution was 
converted to relative frequency by 
the division of each frequency by 210 
( 7 lists, 30 numbers per list). This 
process puts the empirical curve and 
the theoretical random curve on the 
same base. 
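The counting procedure for the empirical curve can be sketched as follows. This is an illustration of the tally as described, not the author's worksheet. The function names are hypothetical, and the short example list is made up for demonstration rather than taken from the data.

```python
from collections import Counter


def first_recurrence_counts(series):
    """Tally, for each occurrence of a number, the number of steps to its
    next (first) recurrence within the same list."""
    counts = Counter()
    last_seen = {}
    for index, value in enumerate(series):
        if value in last_seen:
            counts[index - last_seen[value]] += 1
        last_seen[value] = index
    return counts


def relative_frequencies(lists):
    """Pool the tallies over all lists of one S and divide by the total
    number of responses (7 lists x 30 numbers = 210 in the experiment)."""
    total = sum(len(series) for series in lists)
    pooled = Counter()
    for series in lists:
        pooled.update(first_recurrence_counts(series))
    return {steps: count / total for steps, count in sorted(pooled.items())}


# Made-up example list, for demonstration only.
print(first_recurrence_counts([1, 5, 1, 4, 1]))
print(relative_frequencies([[1, 5, 1, 4, 1]]))
```

Dividing by the total number of responses, rather than by the number of recurrences observed, is what places the empirical curve on the same base as the corrected random curve.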

The form of the curve was the same 
for all 15 Ss, but the individual modal 
points varied from the third to the 
fifth steps. 

Fig. 1 could have been plotted in 
terms of probability of appearance 
(rather than first recurrence) of a 
number. In this case the random 
series curve would have been a hor- 
izontal straight line, since the prob- 
ability of an event in such a series is 
independent of preceding events. If 
the finite list length were taken into 
account, the line would have a slight 
negative slope. The exact form of 
the empirical curve cannot be pre- 
dicted from the data of Fig. 1, but it 
is reasonable to assume that the curve 
would have continued to rise and 
would have approached the random 
series line asymptotically. For the 
first three steps appearance and first 
recurrence are practically the same 
things. However, there were probably
some reappearances of a number
four steps after the point being
examined which were second, rather
than first, recurrences (as in the sequence
1, 5, 1, 4, 1). For most Ss, at
least, the bias against re-use of a
number would extend partially over
four steps.


EXPERIMENTAL TESTS IN NON-
LEARNING SITUATIONS


Unless additional biases counteract 
the mechanism of the hypothesis, it 
should be possible to demonstrate 
artificial ‘spread of effect’ gradients 
in any series of responses given by 
human Ss. The hypothesis requires 
only that two segments of the series 
be aligned on the basis of a response 
common to both segments. The 
method actually used was that of 
requesting Ss to give a number of 
short series, each containing a specified 
number. Series containing the same 
specified number were then paired, 
and the position of other (chance) 
pairings was examined. 
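The logic of this test can be illustrated with a small Monte Carlo sketch. It is a simulation under stated assumptions, not a reanalysis of the data: each simulated series is random apart from a complete bar against immediate repetition, responses run from 1 to 10, and only the four positions following the specified number are generated. All names (`biased_tail`, `simulate_pairs`) are hypothetical.

```python
import random


def biased_tail(start, length, n, rng):
    """Return `length` responses following `start`, each drawn uniformly
    from the n - 1 numbers that differ from the response just before it."""
    tail, prev = [], start
    for _ in range(length):
        choice = rng.randrange(1, n)  # an integer from 1 to n - 1
        if choice >= prev:
            choice += 1               # skip over the previous response
        tail.append(choice)
        prev = choice
    return tail


def simulate_pairs(n=10, after=4, pairs=200_000, seed=0):
    """Proportion of chance repetitions at each position after the
    specified number, over pairs of series aligned on that number."""
    rng = random.Random(seed)
    hits = [0] * after
    for _ in range(pairs):
        specified = rng.randrange(1, n + 1)
        first = biased_tail(specified, after, n, rng)
        second = biased_tail(specified, after, n, rng)
        for i in range(after):
            hits[i] += first[i] == second[i]
    return [h / pairs for h in hits]


rates = simulate_pairs()
print([round(r, 3) for r in rates])
```

Under these assumptions the simulated rate at the first position after the specified number should lie close to one-ninth, with later positions close to one-tenth.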


PROCEDURE 


Parallel number series were obtained in two 
slightly different ways. In the first test, a group 
of 31 Ss were requested to write series of five 
numbers from ‘one’ through ‘ten,’ beginning or 
ending with a specified number. Thus, E might 
ask for five numbers ‘beginning with five,’ or 
‘ending with three.’ When the last number of 
the series was specified, Ss were instructed to 
write down their series with the specified number 
being actually written last. One hundred five-number
series were required of each S, 50 to a
page. The items at the top and bottom of each
page were discarded, leaving 40 to a page for
consideration. Each page then contained two 
series beginning, and two ending, with each of 
the ten possible numbers. 

In the second test the series were 13 numbers
in length, and the specified number fell in the
ninth position. Sheets with rows of 13 spaces,
with the ninth space enclosed in double lines,
were given to the Ss. Series were written from
left to right, with the specified number falling in
the space between the double lines, so that eight
numbers of the series preceded it, and four followed
it. The series were again composed of
numbers from ‘one’ through ‘ten.’ Each S
wrote 23 series, of which the middle 20 were considered.
Thirty-seven Ss participated in this
test.


FIG. 1. Distribution of steps from occurrence to first recurrence of a particular number in a
random series of numbers and in series given by laboratory Ss. (Abscissa: steps till recurrence.)

Both tests were presented to the Ss as studies 
of the manner in which they selected random 
numbers. Instructions in both were to write as 
rapidly as was convenient; and in both, a series 
was covered by a card as soon as it was written. 


RESULTS


Table I presents the results of the 
first test—the series-to-series repeti- 
tions when each series contained five 
numbers, the first or last of which was
specified. Table II presents the
series-to-series repetitions for the sec- 
ond test, in which the specified num- 
ber fell in the ninth position of a 
thirteen-number series. In both tests 
there were occasional instances in 
which the S did not write the number 
as specified. Such series were dis- 
carded, so that the total number of 
observations differed slightly from the 
expected number. 

The distribution of repetitions in 
both tests bears a pronounced resemblance
to the usual spread of effect
gradients, for two steps before or after
the specified number. The high fre- 
quency of repetitions at the beginning 
of the series in both tests probably 
arose from the same sources as the 
repetitions immediately following the 
specified numbers. Although the 
specified number was actually written 
in its proper place in the sequence, the 
announcement of the number served 
as the signal for starting the series, 
so that the effect was probably much
the same as if this number had been
written down at the beginning of the
series as well as in its specified place.
The modified procedure of the second
test was designed to remove this
initial effect further from the specified
number, in order to produce a less
complicated picture of the repetitions
immediately preceding the specified
number.


TABLE I

SERIES-TO-SERIES REPETITIONS IN A
NON-LEARNING SITUATION
(31 Ss)

[Table entries not legible.]


TABLE II

SERIES-TO-SERIES REPETITIONS IN A NON-LEARNING SITUATION
(37 Ss)

[Legible entries, given as number and percent of repetitions: specified number, 369 (100); other positions, 62 (16.8), 49 (13.3), 27 (7.3), 47 (12.7). Position labels for the other entries are not legible.]


The two tests agreed in showing 
gradients of repetition around the 
specified number. They differed in 
that the shorter series yielded con- 
sistently higher percentages of repeti- 
tion in series-to-series comparisons. 
In both tests the series were composed 
of numbers from ‘one’ through ‘ten,’
so the source of the difference must 
lie in the length of the series. The 
probability bias hypothesis would 
predict in this case repetition percents 
of 11.1, 10.01, 10.00 and 10.00 for the 
four steps following the specified 
number. The somewhat higher over- 
all frequency of repetition indicates 
that additional factors are necessary 
for precise prediction of probability of 
repetition. The tendency to choose 
successive numbers from the same 
neighborhood undoubtedly played a 
part. 

It should be noted that the fore- 
gradient, although appearing in both 
tests, was smaller than the after- 
gradient. The results in this non- 
learning situation agree with the 
empirical findings in the typical
spread of effect experiment, and lend
support to theories not requiring the 
reinforcement principle. 


TESTS IN THE SPREAD OF
EFFECT SITUATION


Spread of effect has been demon- 
strated only in situations in which S 
was allowed to choose the response, 
but it has been assumed that the
Thorndikian principle has a much 
wider application. The experiments 
reported here were designed to pro- 
vide a situation in which spread of 
effect could be tested when the re- 
sponse series was assigned by E, 
rather than by S. 

The accomplishment of this aim 
required several modifications of ex- 
perimental technique, and Experi- 
ment I was made to test the effect of 
these modifications on the phenom- 
enon. Experiments II and III were 
planned to test the probability bias 
hypothesis. 


PROCEDURE 


Experiment I: Response Numbers
Selected by S


Experiment I employed a word-number learn- 
ing situation. The stimulus words were two- 
syllable adjectives, combined in lists of 30 words. 
There were seven such lists—one practice list and 
six experimental lists—each presented to the Ss 
two times only. The lists were recorded on six- 
in. Presto Monogram disks, and played back to 
the subjects at the rate of four sec. per word. 
At a recording and playback speed of 33⅓ r.p.m.,
the six-in. disk held one presentation of the list 
per side. 

The lists were played to groups of Ss (approxi- 
mately 20 Ss to a group), and each individual 
recorded his responses on a specially prepared 
data sheet. The group presentation made it 
impossible to call responses correct or incorrect, 
so that the task could not be presented as one of 
guessing numbers. Instead Ss were instructed 
to write down, in a random fashion, one number 
as each word was played the first time a list was 
presented. The range of response was limited to 
numbers from one through six. About 1½ sec.
after the playing of certain words, the E an- 
nounced ‘Remember.’ Ss were told that their 
task was to repeat (on the second playing of the 
list) their responses to the words just preceding 
the announcement of ‘Remember,’ and to avoid 
the repetition of all other responses. That is, 
they were to select, at random from the first six 
numbers, some number other than the one that 
they wrote down the first time through the list. 
It is believed that these instructions had the 
effect of making explicit the instructions an S 
gives himself in the traditional spread of effect 
experiment. For convenience of exposition, 
words and responses marked by the announce- 
ment of ‘Remember’ will be referred to in this 
report as ‘correct,’ and all others as ‘incorrect.’ 
There were three correct responses in each list. 
These fell in the 7th, 16th, and 25th positions. 

The data sheets consisted of a pair of mimeo- 
graphed forms, stapled together with a sheet of 
carbon paper between them. Ss were provided 
with blunt-pointed styluses, so that the numbers 
they wrote on the top sheet were invisible to 
them, but appeared clearly on the bottom sheet. 
As an additional precaution, Ss were provided 
with a four-by-ten-in. card, which they used to 
cover a column of responses as soon as it was 
completed, leaving only the unused portion of 
the page exposed. This recording procedure had 
the principal disadvantage that it was very easy 
for the S to lose his place on the sheet. Since 
the position of a number on the sheet was the 
only cue as to the word with which it went, any 
deviation invalidated the data from that list. 
Ss were requested to mark any list on which they 
lost their position. Such lists were usually recog- 
nizable by gaps in the number sequence, or by an 
excess or insufficiency of numbers in the list. 

Four approximately equal groups, totaling 73 
Ss, were run under these conditions. 


Experiment II: Random Numbers 
Assigned by E 


Experiment II followed the pattern of Experiment I,
except that the response series was assigned
by E, rather than being selected by S.
On the first trial on each list E called out a num- 
ber (which the Ss wrote down as their response) 
for each word as it was played. The second trial 
on each list was identical for the two experi- 
ments. Ss attempted to repeat the responses 
marked by ‘Remember,’ and to avoid repetition 
of all others. Four groups, a total of 72 Ss, 
were run. 

The number series assigned by E were drawn 
from a table of random numbers, with the single 
restriction that no number could appear more 
than three times in succession. Numbers from 
one through six were again used. 


Experiment III: Biased Number 
Series Assigned by E 


Experiments II and III were identical except 
for the sequences within the response numbers 
assigned. The response lists employed in Ex- 
periment III were carefully constructed to con- 
form to average subject bias, being made to fit 
the observed distribution of steps-till-recurrence 
shown in Fig. 1. That is, no number was used 
twice in succession (a slight exaggeration of the 
observed tendency), the incidence of numbers 
recurring after only one intervening number was 
less than chance, etc. A careful attempt was 
made to keep the recurrence distribution as the 
only criterion of selection, so that each number 
had an equal chance of following any other 
number. However, the nature of the restriction 
imposed precluded any ‘random’ drawing of 
number series, so that the possibility of recurring 
patterns of specific numbers cannot be com- 
pletely excluded. 

Four groups, totaling 74 Ss, were run. A 
different response list was used for each group. 
Response numbers one through six were again 
used. 


RESULTS 


The essential features of the results 
are summarized in Fig. 2, which compares
the repetition gradients of the
three experiments. Evidence of a
fore-gradient was poor in all three. 
Experiment I showed more erroneous 
repetitions in the position just pre- 
ceding the correct response than in 
the position two steps before it, but 
the still higher frequency of erroneous 
repetitions in positions —3 and —4 
removed any evidence of a trend in 
the position means. In Experiment 
II the lowest number of repetitions
was in the position just preceding the
correct response. Experiment III
showed a fairly regular trend of
position means, and thus gave some
indication of the usual spread of effect
fore-gradient.

The after-gradient, however, ap- 
peared when Ss selected response 
numbers or when the biased response 
series were assigned (Experiments I 
and III). In Experiment I the 
gradient included only the position 
immediately following the correct 
response, but in Experiment III the 
trend was extended over all four 
positions. When the response num- 
bers were assigned from a random 
series (Experiment II) no gradient 
appeared. What trend there was 
opposed the usual spread of effect 
gradients. The correct response was 
repeated with approximately equal 
frequency in the three experiments 
(Table III). The average number of 
repetitions per S was 12.38, 13.21, and 
14.20 for Experiments I, II and III, 
respectively. 

FIG. 2. Spread of effect under three experimental conditions


The results are expressed in terms
of mean repetitions per S. Since
each list had three correct responses,
and each S had six lists, there were 18
possibilities of repetition in each position.
Consequently these means may
be converted to percent repetitions
by dividing by 18 and multiplying by
100. The expected
number of repetitions per S (if chance
factors alone were operating) is three.
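The conversion just described is simple arithmetic, sketched below for convenience. The chance probability of one-sixth is an inference from the six response numbers and the stated chance expectation of three. The figure 13.21 is the mean number of repetitions of the correct response per S reported for Experiment II. The names `POSSIBLE` and `to_percent` are illustrative.

```python
# Three correct responses per list, six experimental lists per S.
POSSIBLE = 3 * 6


def to_percent(mean_repetitions, possible=POSSIBLE):
    """Express a mean number of repetitions per S as a percent of the
    possible repetitions in one position."""
    return 100.0 * mean_repetitions / possible


# Chance expectation with six equally likely response numbers (assumed p = 1/6).
chance_expected = POSSIBLE * (1 / 6)

print(round(chance_expected, 2))
print(round(to_percent(13.21), 1))
```

The second figure corresponds to the 73.4 percent repetition of correct responses cited for Experiment II in the discussion.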

Expression of results in terms of 
repetitions per S, by position, allowed 
the application of analysis of variance 
techniques to the problem of evalu- 
ation of the gradients. It should be 
noted that had each possibility of 
repetition been an independent event, 
these position distributions would 
have been binomial in nature (p = 1/6,
n = 18), and a transformation of 
scores would have been required to 
remove the association of mean and 
variance. However, an examination 
of means and variances, by subjects 
and by positions, showed so little 
evidence of relationship between the 
two that the transformation was 
deemed inadvisable. 

The analysis examined the variance
associated with Positions, Individuals,
and Groups (the total N for each
experiment was made up of four
groups). However, since the variability
associated with Groups and
with interactions involving Groups
was uniformly non-significant, this
variable may be excluded from consideration,
leaving only Positions and
Individuals as important sources of
variability. Separate analyses were
made for the positions preceding and
following the correct response.


TABLE III

FREQUENCY OF ERRONEOUS REPETITIONS IN EACH POSITION EXPRESSED AS MEAN NUMBER
PER S AND AS PERCENT OF POSSIBLE REPETITIONS

Columns: Fore-gradient positions, Correct Response, After-gradient positions.
Rows: Experiment I (73 Ss), Experiment II (72 Ss), Experiment III (74 Ss); for each, position means and the same values expressed as percent.

[Table entries not legible, apart from a single value of 14.8 whose position is not recoverable.]


Pressure for space precludes pres- 
entation of the variance tables here. 
In general, the analyses confirmed the 
implications of Fig. 2, but several 
points should be noted. 

1. The variance associated with 
position means following the correct 
response was significant in Experiment 
III, but was not in Experiment I or II. 
It might be argued from the lack of 
significance in Experiment I that the 
modification of experimental tech- 
nique altered the phenomenon to be 
observed. However, other experi- 
ments (not reported in detail here), 
employing the same technique but 
varying the range of response num- 
bers or the positions of the correct 
responses, showed comparable gradients
following the correct response.
It would seem, therefore, that the
after-gradient observed in Experi- 
ment I is a repeatable phenomenon, 
even though the variance of position 
means in Experiment I was not 
statistically significant. 

2. In no case was there statistically 
significant evidence of a fore-gradient. 
Even in Experiment III, in which the 
position means preceding the correct 
response showed a regular progression, 
the variability of these means was far 
from significant. Examination of the 
position means of the four groups 
revealed no regularity within groups, 
so that the gradient of the total position
means can probably best be considered
as a sampling fluctuation.

3. It is of some interest that reliable 
individual differences in overall fre- 
quency of erroneous repetitions ap- 
peared in the spread of effect experi- 
ment. Factors such as consistent 
differences in number-favoritism
might well have accounted for this 
difference between Ss. 

One further analysis, a direct com- 
parison of Experiments I and III, was 
made. The analysis of variance was 
in this case a modification of the 
usual three-way analysis. Since different
individuals were employed in
the two conditions, the usual ‘indi- 
vidual’ mean would have been the 
mean of two different individuals 
matched at random. However, com- 
bining the sum of squares for Indi- 
viduals with that for the Individual 
by Condition interaction term gave an 
estimate of the variance of the indi- 
vidual around his own condition mean. 
The Remainder term was a combina- 
tion of the triple interaction and the 
Position by Individual interaction, 
which makes it essentially an Individ- 
ual by Position interaction, com- 
puted independently for each condi- 
tion. The F’s for Condition tested 
the hypothesis that the overall fre- 
quency of repetition was the same for 
the two conditions. Although this 
ratio was not significant in the after- 
gradient, it was highly so in the fore- 
gradient. Examination of the means
shows a greater frequency of repetition
in both cases in Experiment I,
and since the conditions of the experi- 
ments linked the fore- and after- 
gradients, it is probable that the 
conditions of Experiment I led to a 
greater frequency of erroneous trial- 
to-trial repetitions. 

The other F-ratio of importance in 
this analysis was that for the Position 
by Condition interaction, which tested 
the hypothesis that the trend of 
position means was the same for the 
two experiments. Since this ratio 
fell below the five percent level of 
confidence in both the fore- and after- 
gradients, there was no conclusive 
evidence of a difference between the 
experiments in this respect. How- 
ever, the observed difference in the 
overall frequency of repetition and 
the size of the Position by Condition 
interaction F for the after-gradient 
(just below the five percent level) 
make it difficult to conclude that the 
conditions of Experiment III reproduced
faithfully the phenomenon
found in Experiment I. 


DISCUSSION


The principle of spread of effect 
assumes the direct action of a rein- 
forcing state of affairs—not only on 
the stimulus-response connection to 
which the reinforcement belongs, but 
also on adjacent connections. The 
principle requires only a sequence of 
stimulus-response connections, one of 
which is acted upon by reinforcement. 
The conditions of Experiment II, in 
which number responses were assigned 
by the E from a random list of num- 
bers, met this requirement; yet spread 
of effect did not appear. 

It can scarcely be argued that the 
announcement of ‘Remember’ failed 
to constitute a reinforcing state of 
affairs, since 73.4 percent of the re- 
sponses reinforced in this fashion were 
repeated on the second trial. Any 
definition of reinforcement must in- 
clude a situation which produces such 
an obvious increase in the strength of a 
single response. Moreover, in Ex- 
periment I the phenomenon, or at 
least the after-gradient, did appear, 
although the two experiments differed 
only in the manner in which the 
number responses were originally 
chosen. When Ss were allowed to 
select the number responses on the 
first presentation of each list, the 
phenomenon appeared. When the 
numbers were assigned from a random 
table, the phenomenon did not appear. 
The most reasonable interpretation 
would seem to be that something other 
than irradiation or spread of reinforce- 
ment was responsible for the occur- 
rence of the after-gradient in Experi- 
ments I and III. 

The probability bias hypothesis, 
on the other hand, is quite consistent 
with the observed results. In fact, 
since the bias against sequential repetition
has been adequately demonstrated
in previous experiments, these
results serve only to confirm the as- 
sumption that there are no undetected 
biases in the number sequences given 
by an S which tend to nullify the 
gradients of erroneous repetitions. 
It is possible to argue that the 
phenomenon observed here is not the 
same as spread of effect—that the 
modification of experimental tech- 
nique has destroyed the phenomenon 
to be studied and substituted a similar 
one in its place. There is no com- 
pletely satisfactory answer to this 
argument, since it involves a compari- 
son of gradients obtained under quite 
different conditions. All the biasing 
conditions present in Experiment I of 
this series occur in the typical spread 
of effect experiment, so that spread of 
reinforcement could appear only as an 
increase in an already existent gradi- 
ent. The findings of Jenkins and 
Sheffield, that the gradients occur 
only when the correct response is 
repeated, argue against such an inter- 
pretation, but it is not impossible. 
Had a fore-gradient appeared in 
Experiment I, the results would have 
seemed to be more satisfactory. It 
has been suggested by Tilton (9) and 
by Jenkins and Sheffield that the fore- 
gradient occurs as an artifact of the 
after-gradient, due to inadequate con- 
trol of the spacing of correct responses. 
However, the experiments of Farber 
(1) and Zirkle (10,11), in which the 
spacing of correct responses was care- 
fully controlled, agreed in the finding 
of a fore-gradient. Under the condi- 
tions of the Farber and Zirkle experi- 
ments, the probability bias hypothesis 
would predict a fore-gradient. When 
the same list is presented repeatedly, 
S has an opportunity to anticipate 
words to which his previous responses 
have been correct, so that after the 
first trial or two the responses preceding
the correct response should be
dependent on the correct response in 
much the same fashion as the succeed- 
ing responses are on all trials. 

In the present experiments, each 
list was presented only two times. 
This procedure allowed the Ss very 
little opportunity to become ac- 
quainted with each list, so that not 
much evidence of a fore-gradient 
would be expected on the basis of the 
probability bias hypothesis. Martens 
(3) used the two-presentation tech- 
nique with individual testing of Ss 
and announcements of ‘Right’ and 
‘Wrong,’ and reported a fore-gradient. 
Her experiment differed from the ones 
reported here in range of response 
numbers, length of word list, and 
number of correct responses per list, 
as well as in the method of presenta- 
tion of the task, so that the slight dif- 
ference in the results cannot be inter- 
preted with any certainty. Indeed, 
it is quite possible that repetition of 
Experiment I would show the high 
values of the means in positions -3
and -4 to be sampling fluctuations,
so that some evidence of a fore- 
gradient would appear. 

Jenkins and Sheffield have pointed 
out the applicability of their guessing 
sequence hypothesis to the interpreta- 
tion of other experimental demon- 
strations of spread of effect. The 
similarity of the probability bias hy- 
pothesis to their hypothesis makes it 
unnecessary to repeat the argument, 
but one thing should be added. 

The phenomenon of spread of 
variability (1,4,5,11) has received 
no attention in the series of experi- 
ments reported here, but the probabil- 
ity bias hypothesis applies to it as 
well as to spread of effect. In the 
experimental demonstration of spread 
of variability the observed gradients 
lie in the correct responses around an 
incorrect response, so that a single
incorrect response is isolated for study. 
In this case the repetition of the 
marked response is lower than that of 
the surrounding responses. In the 
comparison of two number series con- 
taining the bias against sequential 
repetition, the probability of a trial- 
to-trial pairing is a function of the 
point of the parallel series chosen for 
study. If the point chosen is one at 
which there is a pair, the probability 
of a pair in the following members of 
the series is increased over the un- 
biased chance probability. If the 
point chosen is one at which the two 
series have different numbers, the 
probability of pairing the immedi- 
ately adjacent numbers is reduced. 
Thus, the low frequency of the trial- 
to-trial repetition of the marked re- 
sponse in the spread of variability 
situation decreases the probability of 
repetition of the adjacent correct responses.
In addition, within a series
of responses that the S is trying to
repeat, every repetition increases the
probability of repetitions in adjacent
positions. This action would be expected
to amplify the effects of non-repetition
at the end of the series, and
to produce a peak of repetition in the 
middle of the series of correct re- 
sponses. These two sources of bias 
would seem to account fairly well for 
the observed instances of spread of 
variability. 
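
The two conditional statements about pairing can be checked by direct enumeration. The sketch below is illustrative only. It assumes a hypothetical model that is not taken from the paper: each S chooses among k = 10 numbers and never repeats the number just given, all other numbers being equally likely. The function name and the value of k are assumptions.

```python
from fractions import Fraction


def next_pair_probability(k: int, a: int, b: int) -> Fraction:
    """Probability that two independent series match on the next response.

    Assumed model: series 1 currently shows number a, series 2 shows number b,
    and each series picks its next number uniformly from the k - 1 numbers
    that differ from its own current number.
    """
    step = Fraction(1, k - 1)
    total = Fraction(0)
    for x in range(k):
        if x == a:
            continue  # series 1 may not repeat its own last number
        for y in range(k):
            if y == b:
                continue  # series 2 may not repeat its own last number
            if x == y:
                total += step * step
    return total


K = 10  # hypothetical number of response alternatives
chance = Fraction(1, K)                      # unbiased chance of a pair
after_pair = next_pair_probability(K, 0, 0)  # the chosen point is a pair
after_nonpair = next_pair_probability(K, 0, 1)  # the chosen point is not a pair

print("unbiased chance:", chance)
print("after a pair:   ", after_pair)
print("after a nonpair:", after_nonpair)
```

Under these assumptions the probability of a pair next to an existing pair lies above the unbiased chance value, and the probability next to a non-pair lies below it, which is the direction of the effect described in the text.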

As a final point, the guessing se- 
quence and probability bias hypoth- 
eses might be discussed in the light of 
the findings reported here. Both 
hypotheses would predict the results 
of the tests in the non-learning situ- 
ations, and of Experiments I and II. 
In Experiment III, however, the 
operation of individual guessing se- 
quences was virtually precluded by 
the fact that the same sequence of 
responses was assigned by E to a 
group of Ss. The gradients of erroneous
repetitions appeared unmistakably
in this experiment, so that it
seems more profitable to think of the 
phenomenon in terms of biases in re- 
sponse sequences shared by all Ss, 
rather than in terms of individual se- 
quences. It should be pointed out, 
however, that the single bias against 
sequential repetition of numbers is 
not sufficient to account for the re- 
sults of Experiment III. The prob- 
ability calculations presented in the 
first section of this paper indicated 
that the frequency of erroneous repe- 
titions should drop quickly back to a 
chance base-line, and that the bias 
should be observable in, at most, the 
first two positions following the cor- 
rect response. Although it is difficult 
to determine a ‘chance’ base-line in 
the spread of effect experiment, the 
fact that the after-gradient in Experi- 
ment III extended over four positions 
following the correct response indi- 
cates an incomplete approximation of 
the theoretical model, and suggests 
that some additional source of bias 
might be sought. One source might 
be that proposed by Jenkins and 
Sheffield—the tendency of Ss to choose 
succeeding responses from the same 
neighborhood within the range of 
possible responses. Another source 
might be the distribution of steps- 
till-recurrence, on which the response 
series assigned in Experiment III were 
based. 
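
The statement that a bias against sequential repetition should fade within about two positions can be illustrated with a small calculation. This is a sketch under an assumed model, not the probability calculation of the paper: two independent series over k = 10 numbers, each forbidden only from repeating its own preceding number. The function name, k, and the number of positions are assumptions.

```python
from fractions import Fraction


def pairing_curve(k: int, positions: int) -> list:
    """Probability of a pair at positions 1..n after a certain pair at position 0.

    Assumed model: each series chooses uniformly among the k - 1 numbers that
    differ from its own preceding number. By symmetry, the chance of a pair at
    the next position depends only on whether the present position is a pair.
    """
    after_pair = Fraction(1, k - 1)
    after_nonpair = Fraction(k - 2, (k - 1) ** 2)
    p = Fraction(1)  # position 0: the correct response, repeated on both trials
    curve = []
    for _ in range(positions):
        p = p * after_pair + (1 - p) * after_nonpair
        curve.append(p)
    return curve


K = 10  # hypothetical number of response alternatives
chance = Fraction(1, K)
curve = pairing_curve(K, 4)
for position, p in enumerate(curve, start=1):
    print(position, float(p), "excess over chance:", float(p - chance))
```

In this model the excess over chance is about .011 at the first position and about .0001 at the second, so a gradient reaching four positions would need some further source of bias, as the text argues.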


SUMMARY 


1. This paper presented further 
evidence that the phenomenon of 
spread of effect should be interpreted 
as an artifact of the experimental situation
in which it has been demonstrated.
The probability bias hypothesis
was proposed to show that
known biases in sequences of number 
(or place) responses given by Ss can 
produce the gradients of erroneous
trial-to-trial repetitions around the 
correct response. 

2. Tests of this hypothesis in non- 
learning situations were described. Ss 
were asked to write down series of 
numbers, ‘random’ except for one 
specified number. Series containing 
the same specified number were then 
paired, and the distribution of series- 
to-series pairing of responses was ex- 
amined. These tests showed gradi- 
ents of paired responses preceding and 
following the specified number. 

3. Three experiments, designed to 
test the hypothesis in the spread of 
effect situation, were reported. The 
usual experimental technique was 
modified to make possible the assignment
of response numbers by E on the
first of the two presentations of each
word list. In the first experiment, Ss
were allowed to select response numbers
on the first trial, so that the experiment
was essentially a test of the
modified technique. The gradient of
repetitions following the correct response
appeared as usual.

In Experiment II conditions were 
identical, except that the response 
numbers were assigned by E from a
table of random numbers. Under 
these conditions there was no spread 
of effect. The slight trend that did 
appear was in opposition to the usual 
gradients. 

Experiments III and II were identi- 
cal except that in III the responses 
were assigned from a list constructed 
to conform to subject bias. Under
these conditions the after-gradient 
appeared to a significant degree, and 
there was some evidence of a fore- 
gradient. 


(Manuscript received June 7, 1948) 


MULTIPLICITY OF SET AS A DETERMINANT
OF PERCEPTUAL BEHAVIOR¹


BY LEO POSTMAN AND JEROME S. BRUNER 


Harvard University 


Perception is a selective process. 
At any moment, the organism experi- 
ences only a fraction of what is po- 
tentially available for perception. 
The fact of selectivity is well estab- 
lished, but only a few of its determi- 
nants are clearly known. The role of 
set, so often invoked in explanation of 
perceptual selectivity, is but poorly 
understood. Systematic variations in 
specific characteristics of set have yet 
to be related to measurable, and 
equally specific, changes in perception. 
The present paper is concerned with 
the effects on perceptual behavior of 
variation in one dimension of set: 
multiplicity, i.e., the range or extent 
of alternative stimuli for which the 
organism is perceptually prepared. 

Traditionally, experiments on per- 
ception have put paramount em- 
phasis on the painstaking analysis of 
the stimulus situation, often relegat- 
ing the contribution of the experienc- 
ing organism to a residual category of 
set. But set, no less than the stim- 
ulus itself, has many dimensions. 
Like the stimulus, set may vary in 
intensity, ranging from a vague, un- 
verbalized readiness to an intense and 
explicit expectation. Sets may be 
single or multiple. A single set is 
characterized by readiness to perceive 
one circumscribed, clearly defined 
class of events in the environment 


1 This experiment was carried out as part of 
a research project on the cognitive processes 
under the auspices of the Laboratory of Social 
Relations of Harvard University. The authors 
wish to acknowledge their gratitude to Dorothy 
Postman and Leta Cunningham for their in- 
valuable assistance in the collection and analysis 
of data. 


and no other. Single sets can be suc- 
cessfully established in the laboratory, 
although with some difficulty. The 
subjects in Külpe and Bryan’s classic
experiments who were instructed to 
look only for color or only for form 
had single sets (7). The observer in 
a psychophysical experiment who at- 
tempts to judge one experienced 
attribute (pitch, loudness, brightness, 
and so on) to the exclusion of all other 
characteristics of the stimulus oper- 
ates under a single set. 

Outside the laboratory, however, 
single sets are the exception. We are
habitually ready to perceive alternatively
and to react alternatively to a
wide variety of objects and events in
our environment. Such perceptual
readiness to respond alternatively to 
many classes of events constitutes 
multiple set. Multiplicity, of course, 
varies in degree. There may be 
readiness for two or three classes of 
events, each clearly circumscribed 
and defined, such as in the classical 
disjunctive reaction-time experiment. 
There may be, at the other extreme, 
readiness to respond to an almost un- 
limited range of stimulus objects. 
By way of an analogy, we may say 
that sets vary in sharpness of tuning. 
At one pole there is sharp tuning to 
one ‘frequency’ of objects; at the 
other, indiscriminate receptivity. 
Set, then, varies continuously in degree
of multiplicity or sharpness of
tuning. 

A multiple set need not necessarily 
result in richer, more varied, and 
more effective perceptual responses. 
The various classes of events which a 
multiple set encompasses may or may
not be perceptually compatible with 
each other. They may sometimes 
fuse in a harmonious pattern—the 
taste, smell, and texture of a new 
dish. But readiness to respond to a 
wide range of stimuli may also lead 
to incompatible perceptual alterna- 
tives: trying to watch the traffic and 
scenery at the same time may make 
both activities less effective. Inde- 
cision or attempts at quick alterna- 
tion may impoverish perceptual be- 
havior. 

The experiment with which we are 
concerned deals with a comparison of 
perceptual behavior under single and 
multiple sets. Our special interest is
in two parallel problems: (1) what
are the characteristics of perceptual
behavior which change with changes in
the multiplicity of set, (2) what are
the probable mechanisms through
which multiplicity of set affects perceptual
response?²


THE EXPERIMENT


Our independent variable, then, is singleness 
versus multiplicity of set. Our dependent vari- 
ables consist of an array of measures describing 
the speed, accuracy, content, and pattern of 
perceptual responses under single and multiple 
sets. Single set was established by preparing S, 
through instruction, for the perception of one 
circumscribed class of stimulus objects. Mul- 
tiple set was induced by leading S to expect the 
appearance of one of two possible classes of 
stimulus objects. Under multiple set, S knew 
that either one or the other of these two classes 
of objects would appear but he never knew 
which, and consequently had to be ready for 
both. By the nature of the experimental situa- 
tion, the two classes of objects for which S was 
set could never be perceived simultaneously or 
fused. S was forced into alternation between 
perceptual choices. 


2 In some respects the design of this perceptual 
experiment parallels that used in such classical 
studies of voluntary choice as those of Ach (1) 
and Michotte and Prüm (8). The communality
of design points up the similarity of the problems 
relating to set in the fields of perception and 
reaction. 


Our subjects were 20 undergraduate students.
Each S individually had the task of recognizing 
words presented at rapid exposures in a Dodge- 
Gerbrands tachistoscope (5). On each stimulus 
card, two words were typed in capital letters 
cross-wise to each other at an angle of 45 degrees. 
One of the words of each crossed pair was either 
a color word or a food word, the other a neutral 
word. The three kinds of words occupied the 
two positions in the crossed pair in balanced 
random order, so that position habits could have 
no biasing effect. 

In a Single Set Series, eight such crossed pairs, 
each containing a color word and a neutral word, 
were presented to the S one at a time. His task,
described fully in the instructions which follow, 
was to recognize the color word in each pair as soon 
as possible. These instructions were given 
orally. 


I am going to show you some words in this 
apparatus and I want you to tell me everything 
you see or think that you see. I will show 
them to you only for a brief moment, but I 
want you to tell me everything that you see or 
think you see. The words will be in pairs like 
this. [The S is shown three sample pairs: 
letter-hatbox, mouse-horse, animal-twenty; and
his threshold for the recognition of one of the
words in each pair is determined by a modified
method of limits, using, for obvious reasons,
only an ascending order of exposures.]

One of the words in each pair that I’m going 
to show you will be a color word, that is, a 
word standing for a color. Your task is to see
the color word as fast as you can. Sometimes 
the color word will be in the slanting-up posi- 
tion, sometimes in the down position. Always 
looking first at the word slanting one way 
won’t help you. Your task, then, is to see the
color word in the pair just as quickly as you 
can. All right, are you ready to start? Any 
questions? When I call, “Ready,” you look
into the apparatus, keeping your eyes on the 
red dot [fixation point in the pre-exposure 
field]. The words will appear there. Ready! 
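
The bracketed note in these instructions mentions a modified method of limits with ascending exposures only. A minimal sketch of such an ascending schedule follows, using the step sizes given in the authors' description of the experimental session. The function names are hypothetical, the sample recognition times are invented for illustration, and reading "to 1000 msec." as an absolute ceiling is an assumption.

```python
def reference_threshold(recognition_times_msec):
    """Average recognition time (msec) of the neutral sample pairs."""
    return sum(recognition_times_msec) / len(recognition_times_msec)


def exposure_schedule(threshold_msec, ceiling_msec=1000):
    """Ascending exposure durations (msec) for one regular stimulus pair.

    Offsets relative to the reference threshold follow the description of the
    session: once at -30 and -20, twice at -10, 0 and +10, then steps of 10
    up to +100 and steps of 20 up to +200. Beyond that, steps of 50 msec up
    to an assumed absolute ceiling of 1000 msec.
    """
    offsets = [-30, -20, -10, -10, 0, 0, 10, 10]
    offsets += list(range(20, 101, 10))    # +20 ... +100 in steps of 10
    offsets += list(range(120, 201, 20))   # +120 ... +200 in steps of 20
    schedule = [threshold_msec + off for off in offsets
                if threshold_msec + off > 0]
    exposure = threshold_msec + 200 + 50
    while exposure <= ceiling_msec:
        schedule.append(exposure)
        exposure += 50
    return schedule


# Invented recognition times for the three sample pairs.
threshold = reference_threshold([50, 60, 70])
schedule = exposure_schedule(threshold)
print(threshold, schedule[:10], schedule[-1])
```

In an actual session the series would stop at the first exposure on which S recognized the color or food word; the sketch only lists the exposures that could be reached.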


In a Multiple Set Series another eight word 
pairs, each containing a color word, were used. 
To this set of eight word pairs were added four 
pairs, each containing a food word. The other 
member of these pairs was always a neutral word. 
Since color pairs and food pairs were alternated 
in a predetermined random order, it was not 
necessary to have an equal number of each. 
Food pairs, therefore, were limited to four in the 
interest of minimizing subject fatigue. Color 
pairs, on the other hand, were kept equal in 
number in the two series for purposes of com- 
parison. For the Multiple Set Series the task 
was to recognize either the food word or the color 
word in each pair. The following instructions
were given: 


One of the words in each pair of words that 
I am going to show you will be either a color 
word or a food word, that is, a word standing 
for a color or a food. Your task is to see the 
color word or food word as soon as you can. 
Sometimes the word you are looking for will 
be in the slanting-up position, sometimes in 
the down position. Always looking first at 
the word slanting in one direction will not help 
you. Your task, then, is to see the color word 
or food word in the pair just as quickly as you 
can. All right, are you ready to start? Any 
questions? When I call, “Ready,” you look 
into the apparatus, keeping your eyes on the 
red dot [fixation point in the pre-exposure 
field]. The words will appear there. Ready! 


The 20 Ss were divided into four equal groups. 
Two of the groups started with the Multiple Set 
Series, followed by the Single Set Series. The
other two groups began with the Single Set 
Series, proceeding to the Multiple Set Series. 
Two lists of words were, thus, presented to each 
subject: one in the Single Set Series, the other in 
the Multiple Set Series. The two word lists 
were balanced with respect to their use in these 
Series and in the first and second half of the ex- 
periment. The two lists of color-neutral pairs 
and the list of food-neutral pairs are contained 
in Table I. A summary of the experimental 
design follows: 


          First Half                                 Second Half
Group 1:  Multiple Set (Color List 1 + Food List)    Single Set (Color List 2)
Group 2:  Single Set (Color List 1)                  Multiple Set (Color List 2 + Food List)
Group 3:  Multiple Set (Color List 2 + Food List)    Single Set (Color List 1)
Group 4:  Single Set (Color List 2)                  Multiple Set (Color List 1 + Food List)


TABLE I
LIST OF WORD PAIRS EMPLOYED IN THE SINGLE SET AND MULTIPLE SET SERIES

Color List 1     Color List 2     Food List        Threshold Determination List
yellow-fiddle    beige-check      bacon-house      letter-hatbox
azure-ingot      green-paper      omelet-window    mouse-horse
brown-sword      maroon-length    coffee-rocket    animal-twenty
cerise-basket    black-shift      salad-width
russet-statue    purple-hunter
auburn-poster    violet-public
mauve-abode      indigo-fringe
lilac-match      khaki-swamp


The experimental session began with the successive
tachistoscopic presentation of three neutral
sample word pairs at exposures of 20, 30, 40,
50 msec., and so on, in steps of 10 until one word
of the pair was correctly recognized. The average
of the three recognition times was employed
as a reference threshold. Thereafter, each of the
regular stimulus pairs was presented once at 30
msec. below threshold, once at 20 msec. below,
twice at 10 below, twice at threshold, twice at
10 msec. above threshold, once at 20 above
threshold, once at 30 above threshold, continuing
in steps of 10 msec. up to 100 msec. above
threshold. If S had not yet recognized the food
or color word, exposure times were raised in
steps of 20 msec. up to 200 msec. above threshold,
and then in steps of 50 msec. to 1000 msec., at
which point testing was discontinued if recognition
had not occurred. A five-min. rest period
was given between the two series. The time
required for the experiment varied from one to
two hours.


RESULTS AND DISCUSSION


Recognition thresholds.—What characteristics
of the S’s perceptual behavior
were affected by variation in
the multiplicity of set? First consider
recognition thresholds under
single and multiple set.³

Mean recognition threshold under
multiple set is 228 msec.; in the Single
Set Series it is 191 msec.


3 A comparison was made of color words only,
since these occurred in both the Multiple Set
Series and Single Set Series, whereas food words
appeared in the former only.


Threshold of recognition is a joint
function of several factors, over and
above set. Reviewing the design,
we note the following additional potential
sources of variance: relative
difficulty of the word pairs, the effect
of practice in going from the first to 
the second half of the experiment, the 
perceptual acuity of the individual 
subjects, and the various interactions 
of these sources. To make the best 
possible estimate of the effect of these 
variables, an analysis of variance was 
carried out yielding the F-ratios of 
Table II (based on an error variance 
of 196.91 with 228 degrees of freedom). 
Other things being equal, these re- 
sults indicate that threshold of rec- 
ognition varies significantly with mul- 
tiplicity of set. Under multiple set, 
the critical color words are recognized 
at considerably longer exposure times. 
Speed of recognition is one of the 
characteristics of perceptual behavior 
influenced by singleness and multi- 
plicity of set. 
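
As a reading aid, the relation between an F-ratio and the error variance quoted above can be written out. The mean squares for the separate sources are not reported, so the value below is back-computed from the rounded F-ratio of 5.05 given for multiplicity of set in Table II and is only approximate; the symbol names are assumptions.

```latex
F = \frac{MS_{\mathrm{effect}}}{MS_{\mathrm{error}}},
\qquad MS_{\mathrm{error}} = 196.91 \quad (228\ \mathrm{df})

MS_{\mathrm{set}} \approx 5.05 \times 196.91 \approx 994.4
```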


TABLE II

ANALYSIS OF VARIANCE OF RECOGNITION THRESHOLDS

Source of Variance           df    F-ratio    P
Multiplicity of set                 5.05      .01-.05
First and second halves             2.03      >.05
Different words                     4.02      .01-.05
Subjects                           10.74      <.01
Interactions
  Ss X Set                          1.87      .01-.05
  Halves X Set                      4.18      .01-.05
  Words X Set                       1.17      >.05


There are other significant deter- 
minants of the recognition threshold. 
Some words are more difficult to rec- 
ognize than others, quite apart from 
the set under which the perceiver
works. Different degrees of familiarity
with the stimulus words were a
factor here. The common color words
such as black, brown, and green were 
recognized more quickly than some 
of the more esoteric ones, e.g., indigo 
and azure. Individual differences in 
visual capacity as measured by thresh- 
old level were considerable. The 
large variations in general threshold
levels may be due partly to visual
acuity, partly to differences in Ss’ 
ability to perform the type of per- 
ceptual task which the experiment 
required. 

Why should multiple set raise the 
threshold of recognition? One sug- 
gestive hypothesis is provided by an 
analysis of the interaction between set 
and practice (first and second halves). 
The mean recognition thresholds for 
the two halves of the experiment 
under the two instructions are as 
follows: 


Groups 1 and 3
    Multiple Set, first half: 222 msec.
    Single Set, second half: 163 msec.
Groups 2 and 4
    Single Set, first half: 220 msec.
    Multiple Set, second half: 233 msec.


Multiple set serves strikingly to in- 
hibit the effects of practice. When 
the S works under a single set in the 
second half of the experiment, his 
performance shows the effects of 
practice; his average threshold is 
considerably lower. If the multiple 
set comes during the second half of 
the experiment, it not only prevents 
such practice effects from appearing, 
but even leads to an increase in 
average recognition time. 
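
The arithmetic behind this interpretation can be laid out from the four cell means given above. The sketch is illustrative: the variable names are assumptions, and the simple averages treat the cells as equal in size, which follows from the four equal groups but is not stated for the threshold data themselves.

```python
# Mean recognition thresholds (msec) from the text, keyed by (set, half).
cell_means = {
    ("multiple", "first"): 222,   # Groups 1 and 3
    ("single", "second"): 163,    # Groups 1 and 3
    ("single", "first"): 220,     # Groups 2 and 4
    ("multiple", "second"): 233,  # Groups 2 and 4
}

# Averages over halves, for comparison with the reported 191 and 228 msec.
single_mean = (cell_means[("single", "first")]
               + cell_means[("single", "second")]) / 2
multiple_mean = (cell_means[("multiple", "first")]
                 + cell_means[("multiple", "second")]) / 2

# Change from first to second half under each set (a comparison between
# groups, since each group met a given set in only one half).
single_change = cell_means[("single", "second")] - cell_means[("single", "first")]
multiple_change = (cell_means[("multiple", "second")]
                   - cell_means[("multiple", "first")])

# Halves X Set contrast: how much more the threshold rose under multiple set.
interaction_contrast = multiple_change - single_change

print(single_mean, multiple_mean)
print(single_change, multiple_change, interaction_contrast)
```

The averages of 191.5 and 227.5 msec agree, within rounding, with the reported overall means, and the changes of -57 and +11 msec show the practice gain under single set and its absence under multiple set.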

It is interesting to note in passing 
that our Ss did not respond uniformly 
to change in set, as is evidenced by 
the significant interaction of Ss X Set. 
This finding suggests important indi- 
vidual differences in perceptual flexi- 
bility, i.e., in the ability to adapt 
quickly to a new perceptual situation. 

Mediating Mechanisms.—On an 
earlier page we emphasized the im- 
portance both of delineating those 
characteristics of perceptual behavior 
which are sensitive to changes in set 
and of formulating the mechanisms 
through which set controls perception.
To do these things, one must consider
not only the final phase of perception, 
correct recognition, but also the 
pattern of acts leading to perceptual 
success or failure. Just as it is often 
important in the analysis of the learn- 
ing process to study the behavior 
preceding successful performance, so 
in perception one must examine the 
attempts at perceiving made by the 
person before correct recognition 
occurs. It is from an analysis of 
such ‘pre-recognition hypotheses’ that 
we can arrive at a fuller description of 
the influence of set and the mecha- 
nisms responsible for observed 
changes. To this we turn next. 
Speed of perceptual function.—Did 
multiplicity of set influence, for ex- 
ample, the time and nature of the 
first attempt at recognition of meaning? 
An analysis of variance of the times of 
exposure required before the S haz- 
arded a first ‘meaningful’ hypothesis 
reveals, indeed, that change in set 
had an effect. By ‘meaningful’ hypothesis
we mean, of course, a word
with dictionary meaning, whether it
be the correct one or not, as contrasted
with parts of words, single
letters, or failures to attempt an
interpretation. Under multiple set 
S is significantly retarded in his 
ability to make meaningful interpreta- 
tions of the words presented to him. 
The mean exposure time required 
before a meaningful hypothesis ap- 
pears under multiple set is 136 msec., 
as contrasted with 130 msec. in the 
Single Set Series. Though small, the
difference is consistent, and analysis 
of variance reveals it to be statistically 
significant at the five percent level. 
In several other respects, an analy- 
sis of the exposure levels at which 
meaningful hypotheses first appeared 
offers results parallel to those obtained 
in the analysis of recognition thresh- 
olds. Again, Ss differed significantly 
one from the other, and different
words proved unequal in their capa- 
city for evoking a quick meaningful 
hypothesis. As before, the interac- 
tion of Ss and set yielded a significant 
F-ratio, underlining once more the 
importance of individual differences 
in reaction to change in set. 

In one respect, however, the recognition
thresholds and first meaningful
hypotheses showed an interesting
discrepancy. While in the case of 
recognition thresholds, multiple set 
prevented practice effects from ap- 
pearing, this was not the situation 
where the first meaningful hypothesis 
was concerned. Here, the practice 
effect manifested itself in both the 
Single Set and Multiple Set Series. 
In both series, practice served effec- 
tively to reduce the amount of time 
necessary for the appearance of a 
first meaningful hypothesis in the 
second half of the experiment. It 
may well be that after the first series 
all Ss gained in confidence and were 
ready to hazard an early guess at the 
full meaning of the stimulus word. 

With the one exception cited, then, 
we may say that the conditions which 
raise recognition thresholds (or, bet- 
ter, decrease perceptual sensitivity) 
also serve to retard the early appear- 
ance of meaningful hypotheses. 

Both first attempts at meaningful 
hypotheses and correct recognitions 
are retarded under multiple set. 
Perception seems to function at a 
slower and less efficient pace. To 
obtain a clearer picture of the changes 
responsible for this slowdown of per- 
ceptual function under multiple set, 
we turn to an examination of the 
nature and content of the ‘pre-rec- 
ognition hypotheses.’ 

Congruence of hypotheses.—Consider 
the first meaningful hypotheses haz- 
arded by Ss. Recall that S is in- 
structed or tuned to see the color 
word in the Single Set Series, the
color or food word in the Multiple 
Set Series. What he sees first will be 
a measure of the extent to which his 
set is operating selectively. If his 
first perception of the stimulus is both 
incorrect and not in the area of 
meaning prescribed by the instruction, 
we may assume that his selective set 
is not sharply tuned. Does multiple 
set have the effect of reducing the 
sharpness of ‘set tuning?’ To an- 
swer this question, first meaningful 
hypotheses were classified as con- 
gruent or incongruent with S’s set. 
Under single set conditions, all hy- 
potheses denoting color were classified 
as congruent; under multiple set, 
hypotheses symbolizing food or color. 
The results of this analysis are in- 
structive. The percentages falling 
in the two categories are as follows: 

                          Single        Multiple
                          Set Series    Set Series
Congruent hypotheses        72.6          64.8
Incongruent hypotheses      27.4          35.2



Under multiple set, fewer congruent
first hypotheses occur. What the
S sees in his earliest attempt at rec- 
ognition is more frequently something 
not within the area of meaning pre- 
scribed by his set. The difference is 
significant at the five percent level. 
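
The classification rule just described can be sketched in a few lines. This is an illustration under a loud assumption: the vocabularies below contain only the critical words of Table I, whereas the authors classified any hypothesis denoting a color (or a food), so a real scoring would need fuller word lists. The function names and sample hypotheses are hypothetical.

```python
# Assumed vocabularies: only the critical words listed in Table I.
COLOR_WORDS = {
    "yellow", "azure", "brown", "cerise", "russet", "auburn", "mauve", "lilac",
    "beige", "green", "maroon", "black", "purple", "violet", "indigo", "khaki",
}
FOOD_WORDS = {"bacon", "omelet", "coffee", "salad"}


def is_congruent(hypothesis: str, condition: str) -> bool:
    """True if a first meaningful hypothesis falls within the instructed set."""
    word = hypothesis.strip().lower()
    if condition == "single":
        return word in COLOR_WORDS
    if condition == "multiple":
        return word in COLOR_WORDS or word in FOOD_WORDS
    raise ValueError("condition must be 'single' or 'multiple'")


def percent_congruent(hypotheses, condition: str) -> float:
    """Percentage of hypotheses that are congruent with the set."""
    hits = sum(is_congruent(h, condition) for h in hypotheses)
    return 100.0 * hits / len(hypotheses)


# Invented first hypotheses, for illustration only.
sample = ["green", "paper", "bacon", "house"]
single_pct = percent_congruent(sample, "single")
multiple_pct = percent_congruent(sample, "multiple")
print(single_pct, multiple_pct)
```

Note that the same response may be congruent under one instruction and incongruent under the other, which is why the percentages are scored separately for each series.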
Interpretation of this finding leads 
us to an important characteristic of 
set. Under a single set, S is ready to 
see a narrowly circumscribed range 
of objects. By the same token, he is 
able to exclude potential percepts in- 
congruent with his set. Anyone who 
has looked for a specific name in a 
telephone directory will readily recall 
how efficiently one can exclude from 
perception the names one is not looking
for. On the other hand, if one
has to look for several names, he has 
to stop and consider various irrelevant 
entries much more frequently. The 
greater the multiplicity of the set, the 
greater the range of objects relevant
to the set, and the less the capacity to 
exclude incongruent potential per- 
cepts. 

Reduction in sharpness of tuning.— 
What we have been referring to as 
sharpness of tuning varies, of course, 
along a continuum. The continuum 
may be defined by two ideal poles: 
vigilance⁴ and anoesis. In a state of
complete vigilance, and we adopt the 
term from Head (6), the organism is 
maximally tuned to what is relevant 
to its prevailing set. There is, under 
a condition of vigilance, a minimum 
of intrusion of stimulus characteristics 
irrelevant to the prevailing set. At 
the other ideal pole, anoesis (our in- 
debtedness for the word is to G. F. 
Stout (10)), there is a minimum of 
tuning, and one characteristic of the 
stimulus field is as relevant as another. 
Anoesis is tantamount to that un- 
likely state in which the organism has 
no set at all. Our point is, briefly, 
that the greater the multiplicity of a 
set, the more it moves toward an anoetic 
condition. 

One may object at this point that 
multiplicity of set need not mean re- 
duction in the sharpness of tuning. 
Is it not possible to be sharply tuned 
to two or more classes of events? 
Sharpness of tuning depends, how- 
ever, not only on the readiness to per- 
ceive specific classes of events but 
also on the ability to exclude all that 
is irrelevant to the set. Multiplicity 
reduces sharpness of tuning by reduc- 
ing the range of excluded objects and 
by lowering the speed with which the 
decision between relevant and irrelevant
events can be made. Under
single set, our Ss could exclude everything
that was non-color. Under
multiple set, everything that was
both non-color and non-food was irrelevant.
Ss had to shift back and
forth between two choices and, in the
process of doing so, admitted more irrelevant
percepts than those who operated
under a single set.


4 For a further illustration of vigilance, see 2,
3, and 9.

Inhibition.—Another evidence for 
the inefficiency of perception under 
multiple set is the greater frequency 
of failures to perceive anything what- 
soever upon the presentation of the 
stimulus. A tabulation of such 
‘blanks’ under multiple and single set 
conditions was made. Ss, we find, 
were significantly more likely to ex- 
perience blanks during the Multiple 
Set Series. The mean difference in 
the percentage of exposures under the 
two conditions on which such blank 
perceptions occurred was 2.6 percent. 
Though small, the difference was 
highly consistent and yielded a 
P-value at the one percent level. The 
higher the frequency of blank per- 
ceptions, the less other types of hy- 
potheses can, of course, be used. A 
large number of blank perceptions 
occurs at the expense of fragmentary 
hypotheses (reports of single letters) 
and recognitions of neutral words. 
The greater frequency of failures to 
perceive may be regarded as a sign of 
inhibition under multiple set. Multi- 
ple set, by mobilizing two competing 
classes of response, yields behavior 
conforming to the classical picture of 
inhibition. Inhibition, as Sherrington 
and a generation of neurophysiologists 
have since taught us, is a state of 
affairs in which response systems 
sharing a final common path prevent 
each other from functioning. With- 
out any excursus into mythical neurol- 
ogy, we nonetheless suggest that 
multiple set may result in the same 
state of affairs—inhibition due to 
competing perceptual responses. 

Relation of multiple set to conflict.— 
The inhibitory effects of multiple set 
suggest that in multiplicity of set 
may be found the seeds of mental con- 
flict. A multiple set may well be 
directed toward classes of events 
whose simultaneous presence in per- 
ception is impossible. In a measure, 
this was true in our experiment, for 
color words and food words could not 
be simultaneously present. But our 
Ss were free to alternate between color 
hypotheses and food hypotheses and 
did, in fact, do so. They were 
sensitized to both, and if they failed 
in an attempt to see one, then they 
would try to see the other. Some- 
times, however, such alternation may 
be impossible and perception of one 
class of events must be achieved at the 
expense of other possible perceptions. 
The motorist, caught in a traffic snarl, 
who can attend only to traffic and 
not to the scenery, illustrates this 
situation. If multiple set leads to a 
choice between alternative and mutu- 
ally exclusive perceptual activities, the 
perceiver may find himself in con- 
flict, especially if motivation to pursue 
both activities is intense. It is be- 
cause our Ss could shift back and 
forth between alternative hypotheses 
that their multiple set did not result 
in conflict. Where multiple set nec- 
essitates a decision between mutually 
exclusive perceptions rather than an 
alternation between them, conflict 
may result. 


THE ROLE OF MULTIPLE 
SET IN PERCEPTION 


In our discussion of set as a deter- 
minant of perceptual selectivity, we 
have, as the reader has doubtless 
sensed, been influenced in no small 
measure by Egon Brunswik’s gen- 
eral theory of perception (4). Bruns- 
wik’s conceptualization puts major 
emphasis on the role of intention in 
the construction of a perceptual 
world by the organism. Through 
the utilization of what is immediately 
given in perception (Gegebenheiten) 
the organism aims at, or intends, a 
perceptual object (Gegenstand). The 
organism, to be sure, cannot and does 
not always attain the object it in- 
tended, and perception is often a 
compromise formation (Zwischenge- 
genstand), a compromise between the 
intended object and other possible 
perceptions of the situation. In- 
structing the organism to ‘set’ himself 
to see certain things is, then, an 
experimental manipulation of inten- 
tion. In essence, we have been deal- 
ing here with the complication or 
multiplication of intentions. In our 
experiment, perception of a color 
word or food word represents S’s 
intention. Incongruent and fragmen- 
tary hypotheses are compromise for- 
mations (Zwischengegenstände). 

It has been our aim in this paper to 
analyze the perceptual consequences 
of multiplicity of set. Our experi- 
mental conditions were such that 
multiplicity of set could not lead to 
fusion of the two classes of objects 
relevant to the set. Rather S was 
continually forced into quick shifts 
between alternative perceptual 
choices. What general principles 
over and above compromise formation 
can be invoked to describe the effects 
of multiple intention or set? 

1. On the most general level, we 
may say that multiplicity of set or 
intention impairs the efficiency of 
perceptual selectivity. Not only is 
the S under multiple set slower in 
recognizing the stimulus (or, in Bruns- 
wik’s terms, slower in attaining the 
intended object), but he also fails to 
benefit from practice. Multiple set 
has the function of disrupting the 
normal course of perceptual learning. 

2. Multiple set also serves to slow 
down the speed with which the S 
begins his attempts at the meaningful 
interpretation of stimuli. Under 
multiple set, the search for meaning 
is delayed, is more difficult, and is 
later in reaching its object. 

3. Not only is the search for mean- 
ing slower; it is characterized by a 
more diffuse orientation toward ‘in- 
tended’ environmental objects. As 
we have put it, set is less sharply 
tuned when it is multiple. 

4. In its most extreme form, the 
slowdown of perceptual function under 
multiple set may result in the inhibi- 
tion of meaning, resulting in per- 
ceptual ‘blanks.’ 

Finally, a methodological conclu- 
sion. By varying set systematically 
within one dimension—multiplicity— 
we have been able to demonstrate 
lawful variations in perceptual be- 
havior. This is but one example of 
dimensional analysis of the role of set 
in perceptual selectivity. The di- 
mension of multiplicity itself must be 
subjected to considerably more quan- 
titative and qualitative investigation. 
There remains largely unexplored the 
analysis of such other dimensions of 
set as intensity, conflict among sets, 
and temporal stability. How do they 
affect perception? Until such ques- 
tions are answered, the concept of set 
must remain a nebulous residual 
category. Indeed, until such a time 
as the contribution of the organism to 
perception can be described as rigor- 
ously as we now describe the char- 
acteristics of the stimulus, perceptual 
theory will remain inadequate. 


(Manuscript received June 28, 1948) 


ON THE APPLICATION OF ANALYSIS OF VARIANCE 
TO GSR DATA: I. THE SELECTION OF AN 
APPROPRIATE MEASURE * 


ERNEST A. HAGGARD 
The University of Chicago 


If an investigator uses factorial 
design and the analysis of variance, 
and if he expects his ‘results’ to be 
valid, it is necessary that his data 
meet the basic assumptions under- 
lying these techniques. Briefly, the 
data should possess or approximate 
the following characteristics: (a) ad- 
ditivity, (b) normality, (c) homo- 
geneity of the variances, (d) inde- 
pendence of the means and variances, 
and (e) randomness. An examina- 
tion and discussion of these assump- 
tions can be found in papers by vari- 
ous authors (e.g., 1-5, 10-13, 16-22, 
25, 29, 30, 34, 40). 

The consequence of a failure to 
satisfy these assumptions may be a 
drastic decrease in the validity and 
efficiency of the analysis of variance 
test, and hence a corresponding over- 
or under-estimate of significance 
values. Cochran (12) and others 
have discussed in some detail the 
effects of non-additivity, non-nor- 
mality, heterogeneity of variances, 
and correlation between means and 
variances, and suggested possible cor- 
rections. In some cases (e.g., moder- 
ate deviations from normality), the 
distortion may be negligible. But 
in other instances (e.g., with grossly 
unequal units of measurement or with 
marked heterogeneity among the error 
estimates), the conclusions drawn may 
be quite misleading, and some form 
of correction is necessary (cf. 1-13, 
17, 19-22, 29, 37, 40). 


* The data utilized in these studies were col- 
lected by Nathan W. Shock, under the direction 
of Harold E. Jones, between 1933 and 1938 at 
the Institute of Child Welfare, University of 
California, Berkeley. The present author wishes 
to express his gratitude to Harold E. Jones for 
making possible the two studies in this series. 

These assumptions apply to the 
formal characteristics of any set of 
measures which one wishes to analyze 
by the analysis of variance technique 
and the F-test of significance. With 
particular reference to galvanic skin 
response (GSR) data, the methods of 
quantification are numerous and di- 
verse, and several measures or scales 
have been proposed. For the most 
part, these measures have been used 
rather indiscriminately, on the ground 
that one is about as good as another. 
Only recently have attempts been 
made to analyze the basic character- 
istics of the various frequently used 
scales (20, 21, 31), and the present 
paper is a further step in this direction. 

More specifically, the purpose of 
the present paper is threefold: (a) to 
examine in detail four GSR measures 
in terms of the assumptions under- 
lying the analysis of variance, and, on 
the basis of empirical data, (b) to 
select the one best suited for the 
analysis of the extensive GSR re- 
cords of the California Adolescent 
Growth Study (27, 28), and finally, 
(c) to consider the problem of a 
generally acceptable measure for the 
quantification of the GSR. 


METHOD AND PROCEDURE 


Apparatus.—The changes in the S’s resistance 
(GSR’s) were obtained from a continuous photo- 
graphic record of the deflections of a Leeds- 
Northrup (No. 4799-A) galvanometer. The cur- 
rent of about 7.5 microamps through the S was, 
for all practical purposes, constant throughout 
the experiment. The electrodes were placed on 
the palm of the left hand and left forearm of 
the S. For a complete description of the ap- 
paratus and procedure used in this study, see the 
papers by Jones (26, 27) and Shock (35). 

Subjects.—50 boys and 50 girls from the 
California Adolescent Growth Study served as 
Ss. The data reported in this paper were col- 
lected when they were, on the average, 13.5 
years of age and again four years later. Since 
the sample was a grade group, the age range at 
each testing was approximately two years, with 
a sigma of .5 years. 

The stimulus list—Stimulus words were used 
which had been rated by the Ss as being pleasant, 
indifferent, or unpleasant in emotional tone be- 
fore each of the two testings. They were as 
follows: 


Stimulus words at 13.5 years: Pleasant (ice 
cream, sweetheart, funny paper, vacation 
day), indifferent (tablecloth, lamp shade, waste 
basket, shirt sleeves), unpleasant (cry baby, 
castor oil, bad habit, dumb bell). 

Stimulus words at 17.5 years: Pleasant (hug 
mother, beautiful girl, kiss lips, strong muscles), 
indifferent (book case, wall paper, cloudy day, 
brush clothes), unpleasant (bloody sore, mean 
teacher, no money, secret worries). 


In each testing, the position of each stimulus 
word in the list was systematically rotated from 
S to S to preclude any bias that might result 
from the fixed position of a word in the series. 
The present study is based on a total of 2305 
responses, 1440 from the 13.5 and 865 from the 
17.5 age samples.1 

Measures.—The four GSR measures or scales 
to be examined are: 

1. Resistance change.—This measure (ohms 
GSR) is obtained by determining the absolute 
difference between the resistance level of the S 
at the time the stimulus is presented and the 
resistance at the maximal deflection a few sec- 
onds later. This is perhaps the most frequently 
used measure—probably because the data are 
recorded in this form and no transformation 
is required. We know from earlier studies (20, 
21) that this measure is characterized by non- 
additivity, non-normality, by lack of independ- 
ence and by heterogeneity of the variances for 
different resistance levels. However, since this 
measure is so frequently used, a re-analysis of its 
characteristics was made in the present study. 

2. Conductance change.—This measure is ob- 
tained by determining the absolute difference 
between the reciprocal of the resistance level at 
the time the stimulus is presented and the re- 
ciprocal of the resistance at the maximal deflec- 
tion. This measure was suggested by Darrow 
(14) on the ground that changes in conductance 
are proportional to changes in physiological ac- 
tivity (i.e., microscopic perspiration). On the 
basis of earlier data (20), the conductance change 
measure was shown to lack the quality of addi- 
tivity, but the other assumptions were not tested 
at that time. 


1 Responses to four ‘buffer’ words were also 
available for the 13.5-year sample, and are in- 
cluded in the tests of additivity and normality. 
These stimulus words are: sledge hammer, bath- 
robe, river boat, and fly catcher. 

3. Log resistance change.—This measure is ob- 
tained by determining the absolute difference 
between the log of the resistance level at the time 
the stimulus is presented and the log of the re- 
sistance at the maximal deflection. It was sug- 
gested in an earlier paper (20) that this measure 
should possess the quality of additivity, but no 
test was made at the time, and no information 
was available regarding the other assumptions 
underlying the analysis of variance test. 

4. Log conductance change.—This measure is 
obtained by taking the log of measure No. 2, the 
conductance change. If desirable, it may be 
multiplied by a constant to give units of a con- 
venient size. (The measure used here is not to 
be confused with log conductance change as pro- 
posed by Darrow (15), which is obtained by de- 
termining the absolute difference between the 
log of the conductance at the time the stimulus 
is presented and at the maximal deflection.) 
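
The four definitions above can be sketched in code. This is only an illustration under stated assumptions, not the computing procedure used in the study: the function name `gsr_measures`, the choice of base-10 logarithms, and the expression of conductance in micromhos are all hypothetical conveniences. A zero response has no log conductance change, so the sketch returns `None` for it.

```python
import math


def gsr_measures(r_pre_ohms, r_peak_ohms):
    """Compute the four change scores for a single response.

    r_pre_ohms  -- resistance when the stimulus is presented
    r_peak_ohms -- resistance at the maximal deflection
    """
    # 1. Resistance change: absolute difference in ohms.
    resistance_change = abs(r_pre_ohms - r_peak_ohms)

    # 2. Conductance change: absolute difference of the reciprocals.
    #    Micromhos (1e6 / ohms) are an assumed unit of convenience.
    conductance_change = abs(1e6 / r_peak_ohms - 1e6 / r_pre_ohms)

    # 3. Log resistance change: absolute difference of the logs
    #    (base 10 is an assumption).
    log_resistance_change = abs(
        math.log10(r_pre_ohms) - math.log10(r_peak_ohms)
    )

    # 4. Log conductance change: log of measure No. 2. It is undefined
    #    for a zero response, so None is returned in that case.
    log_conductance_change = (
        math.log10(conductance_change) if conductance_change > 0 else None
    )

    return {
        "resistance_change": resistance_change,
        "conductance_change": conductance_change,
        "log_resistance_change": log_resistance_change,
        "log_conductance_change": log_conductance_change,
    }


# Hypothetical response: resistance falls from 50,000 to 40,000 ohms.
example = gsr_measures(50_000, 40_000)
```

Because measure No. 4 is the log of a difference rather than a difference of logs, it cannot be obtained from measure No. 3 by a change of sign or scale.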

Data.—A previous analysis of these data (33) 
showed a shift in both the mean resistance level 
and in the distribution of level scores over the 
five-year period. The two extreme age samples 
were used in order to select the measure which 
was the most generally useful. Each of the four 
measures was applied to all responses for these 
two years. 


RESULTS: TESTING THE 
FOUR SCALES 


Before examining the properties of 
the four measures tested in this study, 
one characteristic of the data should 
be noted. This is the fact that the 
distributions of resistance levels at 
which the GSRs were recorded differ 
for the two age samples. In Fig. 1 
we see that Ss in the 13.5-year 
sample showed a much wider range 
of levels than did Ss in the 17.5-year 
sample. The resistance level at 
which the GSR is recorded is of 
fundamental importance in deter- 
mining the nature of several of the 
transformations, particularly the re- 
sistance and conductance change 
measures (22). Hence, it seems justi- 
fied to infer that any measure which is 
adequate for these extreme samples 
may also be applied to the intermedi- 
ate age sample distributions. 

In examining the four scales, sepa- 
rate analyses will be made for the two 
age groups. In some cases, however, 
finer breakdowns (e.g., sex differ- 
ences) will be presented if they seem 
justified. 

1. Additivity (real effects should 
be additive).—One of the chief char- 
acteristics of an additive scale is that 
it possess equal intervals over its 
total range (cf. 36). The quality of 
additivity results in an increase in 
the generality of the inferences that 
may be drawn from the observed 
sample means (16, p. 11), whereas 
non-additivity makes the error vari- 
ance larger and more heterogeneous, 
with a consequent loss or distortion of 
information.2 

The task of setting up an additive, 
or equal interval, scale is relatively 
easy in some cases. If one were de- 
veloping certain physical scales, the 
quality of additivity could be attained 
by establishing an equivalence be- 
tween ‘how much one puts in’ and 
‘how much one gets out.’ In trying 
to develop an equal interval scale for 
the GSR, however, no such direct 
test is possible. The only available 
criterion is the relation between 
changes in microscopic (palmar) per- 
spiration and conductance scores (14). 
However, this does not really help us 
to develop a scale which is to measure 
the potency of verbal stimuli, since 
we do not know the relation between 
these physiological changes and their 
psychological correlates. And even 
if we did have this information, we 
would probably find that the scale 
would need to be transformed in 
some manner, since most psycho- 
physical relations are not additive or 
linear. 


2 Such distortions usually are not serious, ex- 
cept in cases where the pooled error variance is 
applied to individual pairs of treatments. In a 
subsequent paper (22) based on the present data, 
some illustrations of the effects of non-additivity, 
etc., in these GSR measures will be given. In 
one instance, the F-values for the same experi- 
mental treatments, and same raw data, ranged 
from 1.71 to 118.11, and in another from 4.83 to 
55.59 when different scales were used. In view 
of such discrepancies, there is little question that 
violation of such assumptions as additivity may 
result in quite invalid conclusions. 


FIG. 1. The relative frequency of occurrence of GSRs at a given resistance level for the 13.5 and 
17.5 age samples. The plotted ordinate values for each scale sum to 100 percent. 


FIG. 2. Relation between mean GSR and 
resistance level for each of the four measures for 
the 13.5 age sample. N = 90.6 percent of the 
total sample. 

The ordinate values are adjusted for the fol- 
lowing two reasons: First, the raw numerical 
values of the four measures are so divergent that 
they could not be plotted on the same graph. 
To adjust for this, the mean GSR values for each 
measure were converted into the percent of the 
mean of all GSRs for that measure. Second, 
there were occasional marked sampling vari- 
abilities from level to level. To correct for this, 
the four adjusted measures at each resistance 
level were averaged, and each measure was 
plotted as the percent of the four measures. 

In view of these difficulties, it 
seemed necessary to take an indirect 
approach—namely, to eliminate the 
variables which are known to effect 
a marked lack of additivity in the 
scale. Probably the most important 
variable is the strong positive rela- 
tionship between the resistance level 


381 


at which the GSR is recorded and the 
average size of the GSR (20, 21, 39). 
More specifically, the method of ap- 
proach used here is based on the as- 
sumption that under conditions of a 
‘standard stimulus,’ the same average 
response should obtain regardless of 
the resistance level at which the re- 
sponse occurs. That is to say, when 
Ss and stimuli are randomized, and 
only the relationship between the 
level of skin resistance and the mean 
GSR is considered, the effects of 
differential stimulus strength and 
subject reactivity should be averaged 
out. In the case of an equal interval 
scale, then, the mean GSR would not 
be a function of the resistance level, 
but rather would be shown as a 
straight line of zero slope. 

FIG. 3. Relation between mean GSR and 
resistance level for each of the four measures for 
the 17.5 age sample. N = 85.7 percent of the 
total sample. The corrections made for Fig. 2 
were also made for Fig. 3. 


TABLE I 

COEFFICIENTS OF CORRELATION AND VARIATION INDICATING THE RELATIONSHIP BETWEEN 
THE RESISTANCE LEVEL AND THE AVERAGE GSR VALUES OF THE FOUR 
MEASURES FOR THE TWO AGE SAMPLES * 

Age                            Conduc-    Log          Log 
Sample         Resistance      tance      Resistance   Conductance 

13.5     r     .99**           -.88**     -.49*        -.16 
         CV    29.1            36.1       7.1          17.9 

17.5     r     [illegible]     -.90**     .53*         -.07 
         CV    42.5            69.3       8.5          18.3 

* The correlation coefficients marked with a double asterisk (**) fall at the .01 level, and those 
marked with a single asterisk (*) at the .05 level of significance. 
The higher correlations for the log resistance than for the log conductance change are due to the 
small GSR variance of the former (see Figs. 2 and 3). The opposing direction of the log resistance 
correlations also indicates the lack of any consistent trend in this measure. The N in each of the 
tables refers to the number of observations used to compute the tabulated values. 


Fig. 2 shows the general relation- 
ships obtaining between the mean 
GSR and the resistance level for each 
of the four measures for the 13.5-year 
sample, and Fig. 3 presents the cor- 
responding data for the 17.5-year 
sample. In view of the data pre- 
sented in these figures, we may say 
that: (a) the size of the average 
resistance change GSR increases as 
the resistance level increases, (b) the 
size of the average conductance change 
GSR decreases as the resistance level 
increases, (c) no systematic trend 
exists in the case of the log resistance 
and the log conductance change 
GSRs, and finally, (d) none of the 
four measures is completely satis- 
factory as an equal interval scale, 
since all show at least some degree of 
fluctuation in the size of the mean 
GSR from level to level. However, 
the log resistance and log conductance 
change measures are the most satis- 
factory, since neither shows a sys- 
tematic shift over the range of levels. 

These results may be expressed 
differently by comparing, first, the 
coefficients of correlation between 
resistance level and mean GSR to 
indicate the degree of relationship, 
and second, the coefficients of vari- 
ation of the mean GSR values, to 
indicate the consistency of these 
measures along the total resistance 
range. These values are presented in 
Table I. It is readily apparent that 
the larger the values of r and CV, the 
less does the particular measure pos- 
sess the quality of additivity, or of 
being an equal interval scale. Con- 
sequently, from the data presented in 
Figs. 2 and 3 and in Table I, the log 
resistance and log conductance change 
measures are the most satisfactory in 
terms of the criterion of additivity.3 
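
The two indices in Table I can be sketched as follows. The sketch assumes that r is the ordinary product-moment correlation between resistance level and the mean GSR at that level, and that CV is 100 times the standard deviation of those means divided by their mean; the function names and the two sets of level means are hypothetical and are not taken from the study's records.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (assumed definition)."""
    return 100.0 * stdev(values) / mean(values)


def additivity_summary(levels, mean_gsr_by_level):
    """Return (r, CV); small absolute values of both suggest equal intervals."""
    return (
        pearson_r(levels, mean_gsr_by_level),
        coefficient_of_variation(mean_gsr_by_level),
    )


# Hypothetical resistance levels (thousands of ohms) and mean GSR per level.
levels = [10, 20, 30, 40, 50]
rising_means = [2.0, 4.1, 5.9, 8.2, 9.8]   # mean GSR grows with level
flat_means = [5.1, 4.9, 5.0, 5.2, 4.8]     # no systematic trend

r_rising, cv_rising = additivity_summary(levels, rising_means)
r_flat, cv_flat = additivity_summary(levels, flat_means)
```

A measure behaving like `rising_means` would show the pattern reported for the resistance change scores, while one behaving like `flat_means` approximates the zero-slope line expected of an equal interval scale.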

2. Normality (the scores should be 
normally distributed).—Most of the 
studies investigating the assumptions 
underlying the analysis of variance 
and the F-test have been concerned 
with the effects of deviations from 
normality (cf. 24). In general, one is 
not justified in using the F-test when 
the population is markedly non-nor- 
mal (4), since in such cases, “the 
efficiency of the analysis of variance 
methods has been found to vary from 
100 percent to zero” (12, p. 25), al- 
though moderate deviations from 
normality are generally of negligible 
significance (18, 34). 

In the case of the present data, 
there is no question as to which of the 
four measures most nearly satisfied the 
requirement of normality. In Fig. 4, 
the total distribution of non-zero 
scores for the 13.5-year sample is 
presented, and Fig. 5 shows the cor- 
responding data for the 17.5-year 
sample. In both instances, the log 
conductance change measure provides 
the distribution which is most nearly 
normal. Consequently, this measure 
should be selected in terms of the 
criterion of the normality of the total 
array of scores. 


3 The effects of the initial resistance level and 
variance arising from non-additivity may be 
removed statistically by using covariance anal- 
ysis and analyzing the reduced variance of the 
resistance (or log resistance) at the maximal de- 
flection. But this becomes a rather elaborate 
procedure when several variables are involved in 
a complex experimental design. 


FIG. 4. The total distribution of non-zero scores for each of the four measures for the 13.5-year 
sample. 

The ordinate values are the relative frequency of occurrence of a response of a given magnitude 
at each interval midpoint. The plotted ordinate values of each of the four measures sum to 100 
percent. 

The abscissa values are adjusted to correct for wide differences in the raw score values of each 
of the four scales so that they may be plotted on the same graph. This was done by dividing each 
raw-score midpoint by the highest midpoint for each measure. This reduction of the four abscissas 
to a common base retains the form of the distributions. The percent of zero scores (i.e., no recorded 
response) for the four measures are as follows: resistance (27.2%), conductance (27.6%), log resistance 
(29.0%), and log conductance change (29.8%). 


FIG. 5. The total distribution of non-zero scores for each of the four measures for the 17.5-year 
sample. The same procedures of plotting the data were used as in Fig. 4. The percent of zero scores 
for the four measures are as follows: resistance (29.5%), conductance (29.5%), log resistance (30.2%), 
and log conductance change (30.2%). 
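
The comparison of distribution shapes in Figs. 4 and 5 rests on inspection of the plotted curves. A rough numerical companion, which the paper itself does not report, is the moment coefficient of skewness; the sketch below is offered only under that assumption, and the scores in it are hypothetical. It shows how a strongly right-skewed set of non-zero scores can become symmetrical after a log transformation.

```python
import math


def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (zero for a symmetric set)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical non-zero change scores, bunched near zero with a long tail.
raw_scores = [1, 2, 4, 8, 16, 32, 64]
log_scores = [math.log10(v) for v in raw_scores]

skew_raw = skewness(raw_scores)
skew_log = skewness(log_scores)
```

A value near zero is necessary but not sufficient for normality, so an index of this kind supplements the plotted distributions rather than replacing them.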

3. Homogeneity (the variables, and 
their error estimates, should have a 
common variance).—The presence of 
unknown heterogeneity among the 
variances, or error estimates, may 
distort one’s findings as a result of 
an undetermined loss of efficiency 
and sensitivity in the estimates of 
treatment effects (cf. 12). 


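TABLE II 

COEFFICIENTS OF VARIATION OF THE 
VARIANCES FOR THE TWO 
AGE SAMPLES 

[Table body not recoverable from the scan. Legible residue: column heads Age Sample, N, 
Resistance, Conductance, Log Resistance; N = 24 for each age sample (13.5 and 17.5); 
entries 108.5, 101.7, 83.2, 83.6, 58.3, and 91.4, whose column assignment cannot be determined.] 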

A test of the homogeneity of vari- 
ances may be made in several ways. 
One casually descriptive method is to 
compute the coefficients of variation 
of a set of variances. It will be re- 
called that there were 12 stimulus 
words to which both boys and girls 
responded in each of the two age 
samples. Using the responses to each 
word as a ‘sample,’ and keeping the 
data of the boys and girls separate, 
there are 24 such samples available 
at each of the two age levels.4 These 
coefficients of variation are given in 
Table II, from which it is readily 
apparent that for both age samples 
the variability among the log con- 
ductance change variances was 
less than among the other measures. 

4 For the 13.5-year data the average N in each 
such ‘sample’ was 40.6 for the boys and 31.8 for 
the girls; for the 17.5-year data, the average N 
was 47.8 for the boys and 42.3 for the girls. 
These are the N’s on which the respective vari- 
ances are based. 


A further test was made, however, 
to determine more precisely the extent 
of the homogeneity of the variances. 
The statistic used is M (23), and its 
purpose is to test the hypothesis that 
the samples are from populations 
having equal variances. Due to the 
nature of this test, if the computed 
value of M is larger than the tabulated 
value (cf. 38), one may reject the 
hypothesis that it would occur if the 
samples are drawn from populations 
of equal variance. 
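
Both checks on homogeneity can be sketched in code. The first function follows the descriptive index of Table II, the coefficient of variation of a set of variances. The second assumes that the statistic M of reference 23 is Bartlett's criterion, M = N ln(pooled variance) minus the sum of ν_i ln(s_i²), with ν_i the degrees of freedom of each sample and N their total; that identification is an assumption, as are the function names and the data. Critical values must still be taken from published tables and are not reproduced here.

```python
import math
from statistics import mean, stdev, variance


def cv_of_variances(samples):
    """Coefficient of variation (percent) of the sample variances."""
    variances = [variance(s) for s in samples]
    return 100.0 * stdev(variances) / mean(variances)


def bartlett_m(samples):
    """Bartlett's M (assumed to be the statistic M of the text).

    M is zero when all sample variances are equal and grows as they diverge.
    """
    dfs = [len(s) - 1 for s in samples]
    variances = [variance(s) for s in samples]
    total_df = sum(dfs)
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    return total_df * math.log(pooled) - sum(
        df * math.log(v) for df, v in zip(dfs, variances)
    )


# Hypothetical responses to three stimulus words.
homogeneous = [[1, 2, 3], [11, 12, 13], [5, 6, 7]]
heterogeneous = [[1, 2, 3], [10, 20, 30], [5, 6, 7]]

m_equal = bartlett_m(homogeneous)
m_unequal = bartlett_m(heterogeneous)
```

The two indices answer different questions: the coefficient of variation describes how widely the variances are scattered, whereas M supports a test of the hypothesis that they estimate a common population variance.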

The computed values of M for the 
13.5-year sample are presented in 
Table III, and corresponding values 
for the 17.5-year sample in Table IV. 
These results show that for the three 
measures of resistance change, con- 
ductance change, and log resistance 
change, all of the computed values of 
M are significant. Thus, if we con- 
sider the responses to each word made 
by each of the various groups of 
samples for these measures, we must 
conclude that none of the groups listed 
in these tables was composed of 
samples of equal variance. For the 
log conductance change measure the 
value of M for the total group was not 
significant, so that the M’s for the 
smaller groups are not computed. 
Consequently, we may not reject the 
hypothesis that the total group of 24 
log conductance change samples are 
from populations with equal variance. 
Thus, this measure should be selected 
in terms of the criterion of the homo- 
geneity of the variances of the samples. 


TABLE III 

VALUES OF M AS A TEST OF HOMOGENEITY 
OF VARIANCES FOR THE 13.5-YEAR SAMPLE * 

[Table body not recoverable from the scan. Legible residue: rows with N = 24, 12, 12, 8, 8, 
and 8 (the last labeled U-words), and one entry, 412.6**, in the N = 24 row.] 

* The values in this table marked with a 
double asterisk (**) fall at the .01 level. The 
tables of M do not extend beyond N = 15. For 
our total group, it was necessary to compute the 
statistic L1, and to refer to the tables of L1 
prepared by Nayer (32). 


TABLE IV 

VALUES OF M AS A TEST OF HOMOGENEITY 
OF VARIANCES FOR THE 17.5-YEAR SAMPLE * 

[Table body not recoverable from the scan.] 

* See note to Table III. 

4. Independence (the variances 
should be unaffected by changes in the 
mean).—If the F-test is to be valid, 
the treatment mean squares and error 
mean squares must be independent. 
If the correlation between them is 
positive, the apparent significance 
level will be unduly exaggerated; if 
it is negative, the reverse effect will 
obtain. 

The most direct, and probably most 
satisfactory, way to determine the 
relation of the means and variances of 
the four measures is to make a plot 
of these values. If the variances are 
proportional to their means, they are 
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Fic. 6. Plots of the relationship between the means and variances of the 24 samples for each of 


the four measures for the 13.5-year sample. 


Closed circles = boys, open circles = girls. 


Because 


of the variability in the range of the mean and variance values for the two age groups, the same scale 
was not always used for the two years. However, the designated values on the two axes are equiva- 
lent. The notations in the lower right-hand corner of each plot signify the absolute value of the 
various measures, since they are derived from the original ohms resistance units. 
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Fic. 7. Plots of the relationship between the means and variances of the 24 samples for each of the 


four measures for the 17.5-year sample. 


not independent, or, in other words, 
covariance exists between the two 
measures. 

The means and variances of the 
responses (GSKs) to the 12 stimulus 
words for the boys and for the girls 
were used to make this test. This 
provides a total of 24 means and vari- 
ances for each measure for each age 
sample. The plots for the relation- 
ship between these values for the four 
measures for the 13.5-year sample are 
shown in Fig. 6, and the correspond- 
ing plots for the 17.5-year sample in 
Fig. 7. In each case, the boys and 
girls are plotted separately, since in 
several instances they tend to show 
separate trends or groupings. This 
may be due to a tendency for girls of 
these ages to react more strongly to 
such verbal stimuli, to differences jin 


Open circles = boys, closed circles = girls 


the initial resistance levels of the two 
sex groups, or to some unknown 
factors. 

In any case, it is readily apparent 
from an examination of the plots in 
Figs. 6 and 7 that (a) the degree of 
relationship between the means and 
variances of the samples and (b) the 
degree of divergence between the 
means and variances for the two age 
samples was least for the log con- 
ductance change. Consequently, this 
measure should be selected in terms 
of the criterion of the independence 
of the means and variances. 
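
The kind of check described above can be sketched in a few lines of Python. This is only an illustration: the samples are invented (they are not the Growth Study data), they are deliberately constructed so that spread grows with level, and the helper names `pearson_r` and `mean_variance_correlation` are hypothetical. The sketch correlates the sample means with the sample variances before and after a logarithmic transformation.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def mean_variance_correlation(samples):
    """Correlate the means of the samples with their variances."""
    means = [statistics.fmean(s) for s in samples]
    variances = [statistics.variance(s) for s in samples]
    return pearson_r(means, variances)

# Invented samples: two response patterns multiplied by increasing levels,
# so that the raw variance rises with the raw mean.
PATTERN_A = [0.5, 1.0, 2.0]
PATTERN_B = [0.25, 1.0, 4.0]
raw_samples = [
    [1 * v for v in PATTERN_A],
    [2 * v for v in PATTERN_B],
    [4 * v for v in PATTERN_B],
    [8 * v for v in PATTERN_A],
]
log_samples = [[math.log10(v) for v in s] for s in raw_samples]

r_raw = mean_variance_correlation(raw_samples)   # clearly positive
r_log = mean_variance_correlation(log_samples)   # essentially zero
print(f"raw scores: r = {r_raw:.2f}; log scores: r = {r_log:.2f}")
```

With these invented values the means and variances of the raw scores are strongly related, while those of the logged scores are not, which is the pattern the criterion looks for.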

5. Randomness (the scores should 
be distributed so that the arithmetic 
mean is an efficient estimate of the 
true mean).—The presence of ran- 
domness is implied when the other 
four conditions, or assumptions, ob- 
tain (3, p. 40). 

6. Maximal precision.—In addition 
to the requirements discussed above, 
it is highly desirable to have a scale 
which measures what it sets out to 
measure with a maximal degree of 
precision. Like the criterion of ran- 
domness, increased precision is im- 
plied when the other four assumptions 
are satisfied, but unlike randomness, 
the relative degree of precision can be 
estimated readily. 

When the data reported in this 
paper were collected, one of the major 
interests of the investigators was to 
measure the differential reactions 
of Ss to words which differed in 
emotional tone. It is thus desirable 
to have a scale which measures the 
reactions of Ss to each stimulus word 
with a minimum of variability. If 
the differences among the responses to 
a specific word were minimal, the real 
differences between words would not 
be unduly decreased. Or, in other 
words, in making comparisons be- 
tween experimental treatments, in- 
creased precision would result in an 
increase in the ratio of the treatment 
sums of squares to the residual or 
error sums of squares, i.e., the 
F-ratio. Under ideal conditions, per- 
fect precision would be characterized 
by a complete lack of variability 
among the responses of the Ss to any 
given word. In experimental terms, 
this would be an absence of any error 
of measurement, and in statistical 
terms, an absence of the undeter- 
mined and uncontrolled variabilities 
that compose the error estimate of a 
significance ratio. 


TABLE V 

Part I 
The average coefficients of variation indicating the relative precision of the four measures for the 
13.5-year sample for boys and girls separately 

             Resistance     Conductance    Log Resistance    Log Conductance 
Stimuli      Boys  Girls    Boys  Girls    Boys  Girls       Boys  Girls 
P-words      139   133      286   264      162   149         91    36 
I-words      150   128      247   193      161   169         101   74 
U-words      125   130      171   252      135   158         79    64 

Part II 
The average coefficients of variation indicating the relative precision of the four measures for the 
17.5-year sample for boys and girls separately 

             Resistance     Conductance    Log Resistance*   Log Conductance 
Stimuli      Boys  Girls    Boys  Girls                      Boys  Girls 
P-words      119   116      144   116      121               60    55 
I-words      170   243      195   153      175               107   83 
U-words      144   120      219   121      171               80    59 

* One of the two Log Resistance entries (Boys, Girls) is missing from each row. 

Part III 
The average coefficients of variation indicating the relative precision of the four measures for both 
age and sex samples 

Stimuli          Resistance   Conductance   Log Resistance   Log Conductance 
P-words          126.8        202.5         138.8            ... 
I-words          172.8        197.0         160.3            ... 
U-words          129.8        190.8         142.5            ... 
Av. of totals    143.1        196.8         147.2            ... 

One method of indicating the rela- 
tive precision of each measure is to 
compute the coefficient of variation 
for each word, for boys and girls, for 
each measure, and for both age 
samples. This was done, and the 
summarized findings are presented in 
Table V. 

From these data we see that the 
coefficients of variation were con- 
sistently smaller for the log con- 
ductance change than for the other 
three measures. In fact, of the 192 
CVs that were computed (i.e., for four 
measures on the 48 samples), the log 
conductance change measure yielded 
the smallest CVs in every instance. 
Consequently, this measure should be 
selected in terms of the criterion of 
maximal precision. 
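
For readers who wish to reproduce this kind of comparison, a minimal sketch of the coefficient of variation follows. It assumes the conventional definition, 100 times the standard deviation divided by the mean, and it uses the n - 1 form of the standard deviation; the paper does not state which form was used. The scores are invented and the function name is hypothetical.

```python
import statistics

def coefficient_of_variation(scores):
    """Return the CV in percent: 100 * standard deviation / mean."""
    mean = statistics.fmean(scores)
    if mean == 0:
        raise ValueError("the CV is undefined when the mean is zero")
    # statistics.stdev uses the n - 1 denominator (an assumption here).
    return 100.0 * statistics.stdev(scores) / mean

# Invented scores for one stimulus word; not taken from Table V.
scores = [2.0, 4.0, 4.0, 6.0]
cv = coefficient_of_variation(scores)

# The CV is unchanged when every score is multiplied by a constant,
# which is what allows measures in different units to be compared.
cv_rescaled = coefficient_of_variation([10 * s for s in scores])
print(f"CV = {cv:.1f} percent; rescaled CV = {cv_rescaled:.1f} percent")
```

Because the coefficient is a ratio, it does not depend on the unit of the scale, so a smaller value indicates relatively less variability among the responses to a given word.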

In conclusion, in view of the fact 
that (a) the log conductance change 
best satisfied the assumptions of ad- 
ditivity, normality, homogeneity of 
the variances, independence of the 
means and variances, and maximal 
precision, and (b) that all of these 
assumptions are rarely if ever com- 
pletely met in any one instance, it is 
apparent that this measure should be 
used to quantify the GSR data of the 
California Adolescent Growth Study. 


Discussion 


In this paper we have considered 
the fundamental assumptions under- 
lying the valid use of the analysis of 
variance and the F-test of significance. 
The purpose of this paper precludes a 
careful examination of both the the- 
oretical rationale of these assumptions 
and certain consequences of the use 
of these procedures with inappropriate 
measures. Adequate discussions of 
the former are available in references 
listed above, and examples of the 
latter will be presented in a sub- 
sequent paper (22). In this dis- 
cussion, however, it does seem advis- 
able to consider the log conductance 
change in terms of its general ap- 
plicability, its practicability, and its 
relation to the basic data of the GSR. 

The question may be raised as to 
whether the log conductance change 
might be applied to data from other 
experimental situations. In certain 
respects the present data differ from 
previous findings. For example, in 
Fig. 2 of this paper, the relation be- 
tween the resistance level and the 
average size of the GSR is shown to be 
a linear function. In another study 
(20, Fig. 1) it was found to be curvi- 
linear. 

This discrepancy may be due to 
differences in the apparatus and ex- 
perimental procedures.⁵ But regard- 
less of the reason, such differences 
occur, since practical considerations 
often preclude the possibility of main- 
taining the conditions that would 
yield data which are strictly compar- 
able from one experiment to another. 
Thus, it is highly desirable that a 
measure be sufficiently general to be 
useful in analyzing data obtained 
under various experimental condi- 
tions. 

These two sets of data are compared 
in Fig. 8: the solid lines represent the 
relationships reported in this paper, 
and the broken lines the ones from a 
study in which different Ss, apparatus, 
experimental procedures, etc., were 
used. In Fig. 8a, the solid line for 
resistance change is shown to be 
linear and the broken line curvilinear. 
For conductance change, the solid 
line drops off at a decelerating rate 
(cf. Figs. 2 and 3), whereas the broken 
line drops off more sharply at first, and 
then starts to rise again with an in- 
crease in the resistance level (cf. 20, 
Fig. 1). Expressed more generally, 
for resistance change, the size of the 
scale interval increases as the resist- 
ance level increases, whereas the 
reverse is true for conductance change. 
Thus, neither is an equal-interval, or 
additive, scale. 

Fig. 8. Schematic representation of the relation between the level of skin resistance and the 
average GSR when the resistance, conductance, and log conductance change are applied to data 
from two separate experiments. 

5 In the case of the present data, a Wheatstone 
bridge arrangement was used, and the electrodes 
were placed in the palm of the hand and the fore- 
arm of the S; in the other case, a linear photo- 
electric microammeter was used, and the elec- 
trodes were placed on the palms of both hands 
of the S. Furthermore, there were differences 
in the experimental procedures, ages of the 
Ss, etc. 

In Fig. 8b, the solid line tends to 
fall off somewhat at the upper re- 
sistance levels. This corresponds 
roughly to the situation depicted for 
log conductance change in Figs. 2 
and 3. If this curve represents the 
application of log conductance change 
to data in which the relation between 
the resistance level and the average 
GSR is linear, what will be its func- 
tion if the relation between the re- 
sistance level and the average GSR 
is curvilinear? The effect of the rise 
in the broken line depicting the con- 
ductance change in Fig. 8a should 
offset the drop in the solid line depict- 
ing the log conductance change func- 
tion in Fig. 8b. In other words, the 
log conductance change should meet 
the criterion of additivity about 
equally well for both sets of data. 

As in the case of additivity, one 
cannot make direct tests of the re- 
maining characteristics or assump- 
tions a posteriori. It is possible, 
however, to make certain inferences 
on the basis of present knowledge of 
distribution characteristics and the 
common properties of the resistance 
change measures for the two samples.⁶ 
Because of the similarities between 
these two sets of data, it seems legiti- 
mate to infer that if the conductance 
change had been applied to the 
earlier data, the means and variances 
of the samples would have been pro- 
portional. 

6 In bridging the gap between these two sets 
of data, the following similarities should be 
noted: (a) the average size of the resistance 
change GSR tends to increase as the resistance 
level increases in both (Figs. 2 and 3) and (20, 
Fig. 1), (b) the resistance change scores are dis- 
tributed in the form of a reverse-J-curve for both 
(Figs. 4 and 5) and (21, Fig. 4), and (c) the means 
and variances of the resistance change scores are 
proportional in both (Figs. 6 and 7) and (21, 
Fig. 3). Furthermore, (d) for both the Re- 
sistance and Conductance change measures in 
the present data, the total distribution of scores 
is a reverse-J-curve (Figs. 4 and 5), and the 
means and the variances of the samples are pro- 
portional (Figs. 6 and 7). 

From this point we can consider the 
effects of applying the logarithmic 
transformation to data with char- 
acteristics similar to those of the 
conductance change measure reported 
in this paper. In cases where the 
means and variances of the samples 
are proportional, the logarithmic 
transformation of the original data 
tends to produce values which are 
normally distributed (13), and also 
tends to make the means and variances 
independent (3). In addition, it tends 
to equalize the variances (19), and a 
scale which equalizes the variances 
tends to meet the assumption of ran- 
domness (3). In other words, under 
the conditions presumably present in 
both sets of data, the logarithmic 
transformation of the conductance 
change values tends to improve the 
qualities of the scale necessary for the 
valid use of the analysis of variance 
and the F-test of significance. 
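
The effect of the logarithmic transformation on a positively skewed set of scores can be illustrated with a short Python sketch. The values are invented and chosen so that the arithmetic is easy to follow; `skewness` is a hypothetical helper that computes the ordinary moment coefficient of skewness, and nothing here reproduces the data of the present study.

```python
import math
import statistics

def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (0 for a symmetric set)."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m3 = statistics.fmean([(v - mean) ** 3 for v in values])
    return m3 / m2 ** 1.5

# Invented scores with a long upper tail (a reverse-J shape).
raw = [1.0, 10.0, 100.0, 1000.0, 10000.0]
logged = [math.log10(v) for v in raw]

skew_raw = skewness(raw)      # strongly positive
skew_log = skewness(logged)   # approximately zero: the logs are evenly spaced
print(f"skew of raw scores = {skew_raw:.2f}, of log scores = {skew_log:.2f}")
```

In this constructed case the transformation removes the skew entirely; with real data one would expect only a tendency in that direction, as the text states.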

The second question has to do with 
the selection of a measure which is 
most desirable in terms of its practic- 
ability. Actually, only one other 
GSR measure has been shown to 
possess the qualities necessary for 
the valid use of the analysis of vari- 
ance (cf. 20,21). This measure is: 


\[
\frac{\log \text{resistance change GSR} + k}{\text{level of skin resistance}}
\]

where k is an empirically determined 
constant. 

When one is able to choose between 
two such scales, it is well to consider 
other factors, such as the ease and 
accuracy with which the transforma- 
tions can be made. The disadvan- 
tages of the measure just cited are 
two-fold: The first is the determina- 
tion of the value of k, which, ideally, 
should be derived for each set of data, 
since k was found to vary with experi- 
mental treatments and from S to S 
(20). To obtain the value of k em- 
pirically, however, involves a good 
deal of time and effort, which gen- 
erally would not contribute toward 
answering the questions which con- 
cern the investigator. The second 
disadvantage is the division of (log 
resistance change GSR + k) by the 
resistance level. The use of log con- 
ductance change is not restricted by 
either of these disadvantages. 

Finally, over and above the practic- 
ability and utility of a measure, it is 
desirable that it also be meaningfully 
related to its basic data. In this 
connection, it seems proper to assume 
that the basic measure of the GSR is 
the change in conductance. This 
position is based on Darrow’s (14) 
finding that the conductance of the 
skin, not the resistance, tends to vary 
with the amount of microscopic per- 
spiration. 
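
To make the relation between the basic readings and the four measures concrete, the following Python sketch computes each of them for a single response. The definitions are assumptions made for illustration: resistance change is taken as the drop in ohms, conductance as the reciprocal of resistance expressed in micromhos, and the two log measures as common logarithms of those changes. The readings are invented and the function name `gsr_measures` is hypothetical; the sketch should not be read as the exact scoring procedure of the study.

```python
import math

def gsr_measures(r_before_ohms, r_after_ohms):
    """Return four candidate GSR scores for one response (assumed definitions)."""
    if not (0 < r_after_ohms < r_before_ohms):
        raise ValueError("expected positive readings and a drop in resistance")
    # Drop in resistance from the pre-stimulus level, in ohms.
    resistance_change = r_before_ohms - r_after_ohms
    # Conductance is the reciprocal of resistance; 1e6 / ohms gives micromhos.
    conductance_change = 1e6 / r_after_ohms - 1e6 / r_before_ohms
    return {
        "resistance_change": resistance_change,
        "conductance_change": conductance_change,
        "log_resistance_change": math.log10(resistance_change),
        "log_conductance_change": math.log10(conductance_change),
    }

# Invented readings: 50,000 ohms before the stimulus, 48,000 ohms after.
scores = gsr_measures(50_000, 48_000)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the same response yields quite different numbers on the four scales, which is why the choice among them matters for the analysis.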

Other things being equal, one 
should use the basic measure directly, 
if it possesses the proper characteris- 
tics; if not, it should be transformed 
in some appropriate manner which 
will remove certain undesirable quali- 
ties and permit the use of various 
statistical tests. In the behavioral 
sciences, the logarithmic transforma- 
tion appears to be the most widely 
useful (primarily because distribu- 
tions of such data typically show a 
strong positive skew). Data on the 
behavior and characteristics of living 
organisms, when thus converted, fre- 
quently tend to be more normally 
distributed, and also show improve- 
ments in terms of the other assump- 
tions mentioned. It is not surprising, 
then, that the logarithmic transform- 
ation, which has been used to advant- 
age with data from such fields as 
bacteriology, medicine, physiology, 
and psychology (cf. 19), should also 
be applicable to the basic measure of 
the GSR. 


SUMMARY 


The valid use of the analysis of 
variance and the F-test of significance 
rests on the assumption that the data 
to which these techniques are applied 
possess the qualities of additivity, 
normality, homogeneity of variances, 
independence of means and variances, 
and randomness. The problem was to ex- 
amine four measures of the galvanic 
skin response (resistance change, con- 
ductance change, log resistance 
change, and log conductance change) 
in terms of these criteria. 

The data, from the California 
Adolescent Growth Study, were ob- 
tained from experiments with 50 
boys and 50 girls, tested at 13.5 and 
retested at 17.5 years of age. A 
total of 2305 responses to words pre- 
viously rated by the Ss as being 
pleasant, indifferent, or unpleasant in 
emotional tone were utilized. Each 
response was quantified according to 
each of the four GSR measures. 
Separate analyses of the data were 
made for the four GSR measures, for 
the two age groups, and in some cases 
for the two sex groups, in terms of the 
assumptions underlying the analysis 
of variance. 

From the findings of this study we 
may conclude that the log con- 
ductance change should be used to 
quantify the GSR for the following 
reasons: 

1. Of the four measures examined 
in this study, the log conductance 
change best satisfies the criteria of 
additivity, normality, homogeneity of 
variances, independence of means and 
variances, randomness, and maximal 
precision. 

2. Of the measures which have 
been shown to possess the qualities 
necessary for the valid use of the 
analysis of variance, the log conduct- 
ance change is the most easily com- 
puted, is most general in its applicabil- 
ity, and is most directly related to the 
basic data. 


(Manuscript received July 13, 1948) 


A FURTHER REDUCTION OF SENSORY FACTORS IN 
STEREOSCOPIC DEPTH PERCEPTION 


BY STEVENSON SMITH 
University of Washington 


An experiment previously reported 
in this Journal (1) showed that 
neither drift of the retinal image 
(during eye movements which occur 
in changes in convergence) nor sub- 
sequent fixation is essential to stereo- 
scopic depth perception. The only 
factors whose necessity was not ex- 
cluded were disparity of stimulus posi- 
tion on the two retinas and the pos- 
sible proprioceptive cues furnished by 
the reflex contraction of the rectus 
muscles. The observations that are 
about to be reported in the present 
article seem to indicate that proprio- 
ceptive cues furnished by eye move- 
ments are not necessary to stereo- 
scopic depth perception and that 
disparity of stimulus position on the 
two retinas alone is necessary. The 
conditions of the present experiment 
were as follows. 


CONDITIONS OF THE EXPERIMENT 


The 12 Ss that were used were 
either graduate students who had had 
considerable laboratory experience or 
were members of the teaching staff. 
The exposure apparatus employed has 
already been described (2). The only 
present modification in this apparatus 
is the omission of the fixation marks 
at the centers of the binocularly 
fused exposure discs. A 35-mm. black 
leader strip film which may be moved 
to successive positions is punched with 
small holes that furnish for each eye 
the necessary pattern of flashes when 
the neon tube behind the film is 
flashed for 1/60th sec. In this way 
three points of light are presented 
very briefly to each eye. Each light 
in each monocular stimulation pattern 
is paired with a light in the other 
monocular pattern in that the two 
are at the same horizontal level. 
Each of the three pairs, which are 
placed in an oblique row from upper 
left to lower right, is, however, sepa- 
rated by a distance which is slightly 
different from the distances separat- 
ing the other two pairs. Thus are 
formed six possible patterns. These 
six patterns that depend upon the 
difference in distance that separates 
each of the three pairs of dots result 
in six different stereoscopic figures. 
When the lighting is made continuous, 
these three pairs of dots appear to be 
three single illuminated dots at three 
different degrees of depth. Named 
from upper left to lower right these 
three illuminated dots appear in the 
following patterns: near middle far, 
near far middle, middle far near, 
middle near far, far near middle, and 
far middle near. Each of these pat- 
terns was presented five times in 
randomized order in a series of 30 
stimuli. Before this was done, how- 
ever, each S was given practice in 
naming three small spheres at the 
ends of rods that protruded through 
a circular board six in. in diameter. 
These little spheres were pushed in 
or out to simulate the six near middle 
far patterns that were about to be 
encountered in the stereoscopic ex- 
posures. It had been found in pre- 
liminary trials that errors occurred 
because of the Ss' lack of facility in 
this kind of verbal response. The Ss 
were therefore practiced with the 
model until they named each of the 
six patterns within three seconds 
without intervening errors. After 
practice with the model they were 
shown each of the six stereoscopic 
patterns in the exposure apparatus 
with sustained illumination. Again 
they named the patterns from upper 
left to lower right in terms of near, 
middle, and far. This step in the 
practice gave the Ss the opportunity 
to focus the lenses to their individual 
correction. Next they were given 
flash exposures of each of the six 
patterns. In case of any error of re- 
sponse, the flash pattern was repeated 
until a correct response was obtained. 
Then followed the series of 30 test 
stimuli during which the Ss were not 
told whether or not their responses 
were correct. 
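
The composition of the test series can be sketched in Python as follows. The sketch only illustrates the counting involved: the six patterns are the orderings of near, middle, and far, and each is repeated five times and shuffled. The function name `make_series` and the use of a seeded pseudo-random shuffle are assumptions for the example; the article does not say how the randomized order was actually produced.

```python
import itertools
import random

DEPTHS = ("near", "middle", "far")

def make_series(repeats=5, seed=None):
    """Return a shuffled list holding each of the six depth patterns `repeats` times."""
    patterns = list(itertools.permutations(DEPTHS))  # 3! = 6 orderings
    series = patterns * repeats
    random.Random(seed).shuffle(series)
    return series

series = make_series(seed=1)
print(len(series))            # 30 stimuli in all
print(" ".join(series[0]))    # first pattern, named from upper left to lower right
```

Each pattern is named from upper left to lower right, as in the Ss' verbal responses.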


RESULTS AND DISCUSSION 


If muscle sense cues were necessary 
for stereoscopic depth perception, a 
1/60th sec. exposure would not suffice 
to furnish a perception of the three 
lights as occupying the near, middle, 
and far distances. The light stimula- 
tion ends before the eyes have time 
to change their convergence even 
once, and in order to perceive all three 
degrees of distance through propri- 
oceptive stimulation there would have 
to be three successive fixations of the 
three pairs of light flashes. Never- 
theless, several Ss made perfect scores 
when each of the six possible vari- 
ations of pattern was presented five 
times in randomized order in a series 
of 30 exposures. Nearly all the Ss 
agreed that the three flashes stand 
out clearly and immediately at three 
apparently different distances. It is 
not likely that after-images contribute 
to this result. Certainly no eye 
movements occur until after the 
stimulation has ended, and there is 
no reason to suppose that three sepa- 
rate degrees of convergence could be 
induced by a stimulus pattern that 
has ceased to operate before the first 
convergence has occurred. 

It might be argued that at the 
moment of stimulation the Ss' retinas 
might be receiving stimulation on cor- 
responding points from one pair of 
flashes, and thus the eyes might be 
caused to change their degree of con- 
vergence by one of the other sets of 
flashes. This would furnish both 
retinal and proprioceptive cues for the 
relative depth of these two stimulus 
patterns. It would not, however, 
afford proprioceptive evidence for the 
relative depth of the third light. The 
only fixation point that might serve 
for a base of reference other than the 
flashes was the margin of the circles 
within which the flashes occurred, and 
these had no fixed spatial relation to 
the distances separating the flashes. 
But to be on the safe side we have 
assumed that the chances are even 
that an S would or would not name 
the depth of all three flashes correctly 
from such muscular cues as are 
afforded. Following are the results. 

Of the 12 Ss, six made no errors in 
the 30 trials, one made 1 error, one 
made 6 errors, two made 7 errors, one 
made 9 errors, and one made 10 errors. 
In the total of 360 responses there 
were 40 errors, or 89 percent correct 
responses. Chance expectation is .5, 
SD = .026. The difference is 15 
times the SD of the difference. 
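
The arithmetic behind these figures can be retraced with a short Python sketch. It assumes that the SD quoted is the standard error of a proportion under the chance value of .5 for 360 responses, an assumption that reproduces the reported .026; the variable names are illustrative only.

```python
import math

# Errors per S as reported: six Ss with 0, then 1, 6, 7, 7, 9, and 10 errors.
errors_by_subject = [0] * 6 + [1, 6, 7, 7, 9, 10]
trials_per_subject = 30

n_responses = len(errors_by_subject) * trials_per_subject   # 360
n_errors = sum(errors_by_subject)                            # 40
p_correct = (n_responses - n_errors) / n_responses           # about .89

p_chance = 0.5
# Standard error of a proportion under the chance hypothesis (assumed formula).
sd = math.sqrt(p_chance * (1 - p_chance) / n_responses)      # about .026
ratio = (p_correct - p_chance) / sd                          # roughly 15

print(f"{n_errors} errors in {n_responses} responses, {p_correct:.2f} correct")
print(f"SD = {sd:.3f}, difference / SD = {ratio:.1f}")
```

On this reading the difference between the observed proportion and chance is a little under 15 times its SD, which agrees with the rounded figure in the text.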

The conclusion seems to be justified 
that proprioceptive cues from changes 
of convergence during fixation are not 
necessary for stereoscopic depth per- 
ception and that position disparity of 
retinal stimulation in the two eyes is 
sufficient. 

(Manuscript received August 9, 1948) 


NEW DATA ON THE INFLUENCE OF FREQUENCY 
AND OF MIND SET 


BY EDWARD L. THORNDIKE 


I. THE NATURE OF THE 
EXPERIMENTS 


The very simple experiment of 
writing or saying words that begin 
with a certain set of letters (such as 
ab, ac, ad, ba, be, bi, def, den, indi, and 
ins) provides facts concerning (A) 
the influence of frequency of occur- 
rence in the strengthening of mental 
connections and (B) the influence of 
the mind’s set in the arousal of mental 
connections to action. 

I report here certain experiments 
with college seniors, graduate stu- 
dents, and myself. In all these ex- 
periments: 

1. Errors by thinking of a word 
which does not begin with the specified 
set of letters or of something that is 
not a real word, are so rare as to be 
negligible. 

2. Some completions come to mind 
immediately as a consequence of the 
set of mind and the sight of the set of 
letters (with or without some pronun- 
ciation in inner speech). Some come 
similarly but after an empty interval 
of a few seconds. For example, such 
direct evocation is adequate to ac- 
count for my first five completions of 
dic (dictionary, dice, dicast, Dickens, 
dictator). 

3. Some are evoked partly or 
wholly by words already given—for 
example, my sixth, seventh, and 
eighth completions of dic (dictate, 
dictatorial, and dictatorship). It is 
almost always possible to distinguish 
such indirect evocations in a person’s 
record; for reasons given later they 
are omitted from consideration in this 
report, unless the contrary is stated 
specifically. For example, proved or 
proof following prove was omitted. 
They occurred fairly often with me 
but are very rare in the records of 
other subjects. 

4. When words do not come to 
mind as in (2) or (3), the subject may 
try the effect of adding a certain 
letter or sound to the stimulus. For 
example, my completions for mal were, 
in order, malaria, male, mala (Latin), 
malum (Latin), malice, malapert, mal- 
acca, malison, malt, and malkin. I 
stopped at that point, but now that I 
seek suggestions by adding a, b, c, d, 
etc., I think of malcontent, Malden, 
malfeasance, malgré, malheur, malign, 
mall, mallow, and others. In the ex- 
periments with myself as subject I 
deliberately avoided all such aids and 
stopped each experiment when words 
of class 2 ceased coming to mind, but 
some may have intruded themselves, 
though very rarely. In all the experi- 
ments with other subjects, process 4 
is absent or very rare. No record 
such as ‘sad, sag, sam, sat, say,’ or 
‘sob, sock, sod, son, sou,’ or ‘sea, sec, 
sell, set, sen,’ or ‘pad, pal, pan, pat, 
paw’ occurs in the work of any of the 
34 graduate students with tasks such 
as writing quickly five words that 
begin with sa, sh, so, sp, pr, pa, etc. 

As would be expected, the positive 
influence of frequency of occurrence 
in the person’s past experience is 
demonstrable with certainty. It will 
be found true with all persons that, 
among the sequents to which a set of 
letters had led in past experience, 
one to which it has led often will be 
evoked in preference to one to which 
it has led less often, other things 
being equal. 




This pressure toward completion 
by frequent past sequents of the 
specified sets of letters is still potent 
and causes many ‘wrong’ completions 
to come to mind if the subject is 
instructed to make words that are 
verbs, or adjectives, or Latin words, 
or names of persons or places, or rare 
words. 

The existence of a strong tendency 
for ann to evoke announce is con- 
sistent with weaker but real tend- 
encies for it to evoke annates, annular, 
etc. The experiment taps a hidden 
reservoir or equipment of tens of 
thousands of connections each of 
measurable strength.¹ This reservoir 
or equipment is of little importance in 
the practical uses of words and syl- 
lables, but a dynamically similar reser- 
voir of the tendencies of words and 
phrases to evoke meanings is of great 
importance. Both of these equip- 
ments are different from the equip- 
ment of arithmetical connections lead- 
ing from 3 + 2, 3 — 2, 3 X 2, etc., 
or the equipment of spelling connec- 
tions leading from the sounds of 
various words, inasmuch as the latter 
have been subject to much more pre- 
vention and elimination of ‘wrong’ 
connections, leaving as a rule only 
one that is ‘right’ and has enormously 
greater strength than all others in 
educated persons. 

Experiment I makes a beginning at 
measuring the relative strength of the 
different connections evoked by ab, 
ac, ad, and 135 other sets of two (or 
rarely three) letters in a composite of 
six college students, who wrote one 
completion for each. After an inter- 
val 131 of the 138 tasks were done 
again. I use the two completions in 
these cases. 

1 Reservoir is a misleading term for the realities 
in the neurones which cause the tens of thousands 
of probabilities in question. Equipment is none 
too good a term. For the right term we must 
wait until we know what the anatomical or 
physiological realities in the neurones are. 

Classifying the completions accord- 
ing to frequency of occurrence in a 
miscellaneous collection of books, 
magazines, etc. we find that comple- 
tions making very common words 
(appearing over 100 times per million 
running words) occur 12 times as 
often as they would if drawn by 
chance from all the words a college 
student might reasonably be expected 
to have seen, heard, spoken, or 
written, and that completions making 
very rare words occur less than one 
sixth as often as they would if so 
drawn by chance. 

The remarkable fact is not that 
completions making rare words (such 
as ammeter and aural) are so much 
less frequent, but that they had as 
great frequencies as they did. It is 
remarkable that the first completions 
of du, ge, gu, hi, and ni to come to mind in 
these six persons should ever be 
duofold, gewgaw, gullible, hijack, and 
nihil. It might well be supposed that 
a certain threshold value of the com- 
bined forces of frequency, recency, 
and intensity would be required and 
that du → duofold, ge → gewgaw, gu 
→ gullible, etc. would be below that 
value. It is remarkable that of the 
hundred or more words beginning 
with ge which a college student has 
seen or heard or written or said, 
gecko, geezer, Gehenna, generalissimo, 
and other such rarities should have 
made persisting brain connections 
with ge that are as truly and definitely, 
tho not as often, operative as those 
made by get, general, generation, gener- 
ous, genius, gentle, and gentleman. 

The same fact appears in words 
written in Experiment II by the 
composite of 34 graduate students. 
Of these, 34.3 percent were words 
occurring over 1800 times in the 
Thorndike-Lorge counts, but words 
of every degree of rarity down to such 
as sadistic, saponify, shill, shush, 
semantics, selenium, sudorific, praecox, 
polarity, and podium appear also. 


II. THE CAUSATION OF THE EQUIP- 
MENT OF CONNECTIONS EVOKING 
WORDS AS COMPLETIONS OF AN 
INITIAL LETTER OR LETTERS; 
THE INFLUENCE OF A 
PERSON’S PAST 
EXPERIENCES 


Presumably the strength of each of 
the connections operating in these 
experiments is a resultant of its 
recency, frequency, intensity, and 
satisfyingness. 


I shall try in this section to determine how 
much of the strength is caused by frequency, if 
recency, intensity, and satisfyingness are equal. 
For example, if the word sallow has occurred in a 
person’s experience three times as often as the 
word saline, will the former be three times as 
likely as the latter to be evoked as a completion 
of sa? If not three times as likely, how many 
times as likely? If the word salt has occurred 
in a person’s experience 127 times as often as the 
word saline will salt be 127 times as likely as 
saline to appear as a completion of sa? If not 
127 times as likely, how many times as likely? 

We do not, of course, know the number of 
times that any of the six persons of Experiment I 
or any of the 34 persons of Experiment II had 
experienced any of the sets of two or three letters 
in any words, but it is possible to obtain from 
the Thorndike-Lorge counts approximate esti- 
mates of the relative number of occurrences in 
the past experiences of certain sorts of persons of 
any two words, and of any two groups of words 
classified by number of occurrences. 

Suppose that the past experiences of the 34 
graduate students in seeing, hearing, writing, 
and saying words beginning with sa, sh, so, sp, 
se, st, su, pr, pa, po, or n paralleled exactly the 
Thorndike-Lorge counts. Then in a sample of 
18,000,000 they would have experienced the 15 
commonest of these words 436,736 times, the 
25 next commonest 188,576 times, the 137 next 
commonest 397,090 times, the 179 next common- 
est 226,020 times, and so on as shown in Column 
C of Table I. This is not unreasonable. For 
words occurring 36 times or less in the 18,000,000 
of the Thorndike-Lorge counts it may be more 


TABLE I 

Relative Frequencies of Words Beginning with sa, sh, so, sp, se, st, su, pr, pa, po, and n 
in the Past Experience of 34 Graduate Students as Estimated from Their Frequencies 
in the Thorndike-Lorge Counts, and as Observed in Exp. II 

Facts Computed from the T-L Counts 
  A. Degree of Commonness in the T-L Counts 
  C. Number of Occurrences in the T-L Counts 
  D. Column C Adjusted as Shown 
Relative Frequencies in Permilles 
  E. Estimated from C 
  F. Estimated from D 
  G. Observed in Exp. II 
  H (G ÷ E). Ratio of Observed to Estimated from C 

Legible entries: 
  A (first class): 10000 or over 
  C (in printed order): 436736; 188576; 397090; 226020; 38880; 49140; 55350; 
     91530; 58930; 43533; 19521; 14630; 2660 
  D: 19521 × .8 = 15616; 14630 × .5 = 7315; 2660 × .2 = 532 
  H: 17.50 

* Estimated. From unpublished records of words with occurrences of 4 or less in the counts and 
from the facts in dictionaries, it seems reasonable to set the number of words that would figure 
appreciably in the experience of graduate students or college seniors as roughly equal to the number 
with T-L credits of 5 to 17. Any reasonable estimate will not alter the general picture of Table I 
appreciably. 
reasonable to suppose that the past experiences 
fall below the Thorndike-Lorge counts. Drastic 
reductions by 20, 50, and go percent, respec- 
tively, for words with Thorndike-Lorge credits 
of 18-35, 5-17, and 4 or less give the facts shown 
in Column D. When the frequencies are put in 
permilles for easier inspection, we have Columns 
E and F as two estimates of relative frequencies 
in the past experience of the group of 34 students. 
The drastic reductions for the three rarest groups 
make little difference in the general picture. 
Any reasonable allowance that any competent 
psychologist or student of language may make 
for the greater relative frequency of common 
words in speech and hearing and the greater rela- 
tive frequency of rare words in the serious read- 
ing of these students than in the Thorndike- 
Lorge counts will not alter the permilles of 
Column E or F greatly. 
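
The conversion to permilles can be sketched as follows. This is only an illustration, assuming that the thirteen Column C totals legible in Table I run from the commonest class to the rarest, and that the multipliers .8, .5, and .2 printed in Column D apply to the three rarest classes while the other classes are left unchanged. The function name `permilles` is hypothetical.

```python
# Column C totals as printed in Table I (assumed ordered commonest -> rarest).
COLUMN_C = [436736, 188576, 397090, 226020, 38880, 49140, 55350,
            91530, 58930, 43533, 19521, 14630, 2660]

# Multipliers printed in Column D for the three rarest classes.
# Assumption: every other class is left unadjusted (multiplier 1).
ADJUST = [1.0] * 10 + [0.8, 0.5, 0.2]


def permilles(counts):
    """Express each count as parts per thousand of the column total."""
    total = sum(counts)
    return [1000.0 * c / total for c in counts]


column_d = [c * k for c, k in zip(COLUMN_C, ADJUST)]
column_e = permilles(COLUMN_C)   # relative frequencies estimated from C
column_f = permilles(column_d)   # relative frequencies estimated from D

# Print E and F side by side, one line per class.
for e, f in zip(column_e, column_f):
    print(f"{e:8.2f} {f:8.2f}")
```

Under these assumptions no class shifts by as much as five permilles between the two estimates, which is the sense in which the reductions make little difference to the general picture.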


The observed relative frequencies 
in the experiment, shown in Column 
G, are far below the frequencies in 
past experience for the very common 
words and increasingly above them 
for the rarer words. This will still 


hold true for any reasonable estimate 
of the frequencies in past experience. 
There is thus evidence of diminishing 


returns from added experiences of a 
word. This is shown even more 
clearly by the ratios of the frequency 
observed in the experiment to the 
frequency in past experience, in 
Column H. 

A similar treatment of the records 
from the six college seniors who com- 
pleted the 138 sets (ab, ac, ad, etc.) 
gives the ratios of the relative fre- 
quencies of the completions written 
to the estimated relative frequencies 
in past experience which are found in 
the second column of Table II. 


In Experiment III, 129 graduate students 
wrote words completing these ten sets of letters: 
ab, ag, ba, ca, el, fo, in, la, me, and pl, one com- 
pletion for each set of letters. A scrutiny of the 
relative frequencies of the completions of each 
set in comparison with what would be expected 
if they had paralleled the relative frequencies in 
the persons’ past experiences shows wide varia- 
tions in the relation, which, however, are con- 
sistent with a general status of the relation much 
like that found in Experiments I and II. 
TABLE II 

The Ratios of Frequency Observed in the Experiments to Estimated Frequency 
in Past Experience for Each Class of Words in Experiments I, II and III 

Column headings: Class of Words by Degree of Commonness in T-L Counts; 
Exp. I, 6 College Seniors; Exp. II, 34 Graduate Students; 
Exp. III, 129 Graduate Students; Over All Experiments 

Legible row labels: 1800 or over; 10000 or over; 4000-9999; 1800-3999 
Legible entries (positions uncertain): 2.14; 3-34; 3-15; 13.75 

Four tables like Table I were then made for 
ab, ag, and el combined, fo, la, and me combined, 
ba and ca combined, and in and pl combined. 
These tables are not shown here. The relation 
still shows much variation. The fact is that 
wide sampling of sets of initial letters is even 
more important than a wide sampling of persons 
in order to determine the influence of frequency 
of occurrence of a word in past experience upon 
frequency of evocation of it by its initial letters. 
When the corresponding entries under ‘esti- 
mated’ in these four tables are averaged (with 
weights of 3, 3, 2, and 2, respectively), and like- 
wise the entries under ‘observed,’ and the ratios 
of observed to estimated are computed from 
these weighted averages, we obtain the facts of 
the fourth column of Table II. These are in 
general accord with the facts from Experiments 


I and II (shown also in Table II). 
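
The weighting step can be sketched as follows. The weights 3, 3, 2, and 2 are those stated in the text; the permille values are hypothetical, invented only to show the arithmetic, and the function names are assumptions. The ratio is taken after the weighted averaging, not averaged across the four tables.

```python
WEIGHTS = [3, 3, 2, 2]  # weights given in the text for the four combined tables


def weighted_mean(values, weights=WEIGHTS):
    """Weighted average of one class's entries across the four tables."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def ratio_observed_to_estimated(observed, estimated, weights=WEIGHTS):
    """Ratio of the weighted-average observed to the weighted-average estimated."""
    return weighted_mean(observed, weights) / weighted_mean(estimated, weights)


# Hypothetical permilles for a single class of words in the four tables.
observed = [120.0, 90.0, 150.0, 60.0]
estimated = [400.0, 350.0, 500.0, 300.0]

ratio = ratio_observed_to_estimated(observed, estimated)
print(round(ratio, 3))
```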


The diminishing returns shown in 
these experiments might have been 
caused not only by smaller increments 
from successive occurrences of the 
words per se, but also by differences 
in intensity, recency and satisfying- 
ness. The satisfyingness of hearing, 
seeing, saying, or writing the words 
may be disregarded as a cause of 
diminishing returns from the con- 
nection between the initial letters and 
the words, because there is no reason 
to believe that it is negatively cor- 
related with frequency of occurrence. 

The recency of hearing, seeing, 
saying, or writing a word has some 
influence in raising the relative fre- 
quency of evocations of rare words in 
the experiments, because in general 
the rarer a word is the later in life it 
is experienced. In the case of the 
129 graduate students, for example, 
the following words in the 0-4 
Thorndike-Lorge group may have 
come to mind because of recency: 
abscissa, agenda (2), agglutinate, 
agronomy, agronomics, calibrate, intro- 
version, and mechanistic. There are 
also doubtless a few cases where 
events within a few hours of the ex- 
periment cause certain completions, 
but such may be of all degrees of 
commonness in the T-L counts, and 
so may be treated as of little conse- 
quence in estimating the curve of 
diminishing returns from successive 
occurrences per se. 

Intensity cannot be disregarded. 
She, should, some, and such 
are very common in past experience 
but are rarely profound or exciting 
experiences. She was certainly the 
most frequent completion of sh in the 
past experiences of the 34 students, 
but appears in their completions only 
five times, much less often than shoe. 
The word for was written only 17 
times by the 129 persons, but prob- 
ably occurred in their past hearings, 
seeings, sayings, and writings half as 
often as all other fo words together. 
On the average the articles, preposi- 
tions, conjunctions, pronouns, and 
auxiliary verbs are of extremely high 
frequency but extremely low intensity, 
being swamped by livelier words and 
acting like James’ transitive states 
or “fringes of thought.” Conse- 
quently there would be some discrep- 
ancy between frequency in past ex- 
perience and frequency in the com- 
pletions of our experiments in the 
commonest thousand words (T-L 
scores of 1800 or over), even if there 
were no general law of diminishing 
returns. 

In the case of Experiment III, with 
completions of ab, ag, ba, ca, etc., 
by 129 persons, I have measured the 
influence of intensity roughly by 
omitting from consideration above, 
about, again, ago, can, case, else, for, 
forasmuch, former, formerly, inasmuch, 
indeed, insomuch, instead, into, latter, 
latterly, meantime, and meanwhile, 
which seem the weakest score of the 
ab, ag, ba, ca, etc., words in the T-L 
list. The ratio, (observed in Experiment III) ÷ 
(estimated frequency in 
past experience), rises by 38 percent 
for words over 10,000 in T-L, and 
falls moderately (around 23 percent) 
for all the other classes of words. 


By the courtesy of Dr. Calvin W. Taylor of 
the University of Utah I am able to study the 
relation of frequency of occurrence in past ex- 
perience to the words written by 203 high-school 
seniors in test 14 described in his article in 
Psychometrika, Dec. 1947. Their task was to 
“Write as many words as you can which begin 
with s.” The time limit was 3½ min. Nine- 
tenths of the subjects wrote from 25 to 52 words. 
Nobody wrote fewer than 20. I used the first 
word and also the first five words written by 
each person, but omitting words obviously sug- 
gested by a word already written (such as ‘say- 
ing’ or ‘said’ following ‘say,’ ‘sorrow,’ following 
‘sorry’ or ‘sang’ following ‘sing’). For 180 per- 
sons I used also the first 10 words written. 

As a fore-exercise to writing words beginning 
with s, Taylor had these students write words 
beginning with P. Seven lines were left for this, 
but the number written was often less. We can- 
not be as sure that the P words are fair samples 
of those that came spontaneously to the subjects’ 
minds, as we can in the case of the S words where 
the subjects tried to write as many words as they 
could. With the P words a subject might dis- 
card some that came to mind in order to write 
words that might look better. One subject, for 
example, wrote pulchritude, pessimist, pecuniary, 
phonigraph (sic) and play. I used (a) the first 
word written, and (b) all the words written in 
this fore-exercise. 
As a third variant of these experiments I have 
used the five-letter words written in Taylor’s 
fore-exercise to his “Number of letters” test. 

As a reasonable estimate of the relative fre- 
quencies of words in the past experiences of these 
high-school seniors, I multiplied the T-L fre- 
quencies of groups 1800 or over by 1.30, those 
of 900-1799 by 1.10, those of 720-899 by 1.05, 
those of 180-719 by 1.00, those of 90-179 by .90, 
those of 36-89 by .80, those of 18-35 by .70, those 
of 5-17 by .20, and those of 0-4 by .05. 
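
The adjustment just described amounts to a lookup by T-L score, which may be sketched as follows. The multipliers are those listed in the text (reading the class bounds as 900-1799 and 0-4); the function name and the sample count are hypothetical.

```python
# (lower bound of T-L class, multiplier), from the commonest class downward.
MULTIPLIERS = [
    (1800, 1.30),
    (900, 1.10),
    (720, 1.05),
    (180, 1.00),
    (90, 0.90),
    (36, 0.80),
    (18, 0.70),
    (5, 0.20),
    (0, 0.05),
]


def multiplier(tl_score):
    """Return the multiplier for a word's T-L count per 18,000,000."""
    if tl_score < 0:
        raise ValueError("T-L score cannot be negative")
    for lower_bound, factor in MULTIPLIERS:
        if tl_score >= lower_bound:
            return factor


def estimated_experience(tl_score):
    """Adjusted frequency used as the estimate of past experience."""
    return tl_score * multiplier(tl_score)


# A hypothetical word counted 50 times falls in the 36-89 class.
print(estimated_experience(50))
```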


The records for S words, P words 
and 5-letter words from the high school 
students do not agree closely with 
each other or with those from the 
college and graduate students.² On 
the whole they do show specially small 
effects per occurrence for experiences 
of the very common words and specially 
large effects per occurrence for 
experiences of the very rare words. 

If we compute the ratios of fre- 
quency observed in the Taylor data 
to estimated frequency in the past 
experiences of high-school seniors for 
first S word written, first five S words 
written, first 10 S words written, first 
P word written, all P words written, 
and all 5-letter words written, and 
average the six for each group, we 
have, from the commonest to the 
rarest group, in order, .40, .25, 1.19, 
1.26, 1.55, 2.22, 1.45, 1.57, 1.57, 1.52, 
2.79, 4.57, and over 10.00. 

Using the facts from all the experi- 
ments and making allowances to 
equalize for intensity of the experi- 
ences, I obtain estimates of the ratios 
of relative frequency observed in the 
persons’ responses to relative fre- 
quency of the words in their past 
experiences as given in the last column 
of Table II. 

Reasonable estimates of the strength 
added per occurrence are as follows: 
Let E equal the average number of 
experiences that a person has had of 
absentee, acetylene, adroitly, adulteration, 
affidavit, aloes, analytic, andirons, 
anthology, anthropologist, and the other 
words that occurred exactly once per 
million in the Thorndike-Lorge 
counts.³ Let S equal the average 
amount of strength added to the 
connection between the first two 
letters of a word and the word as a 
whole by the (E)th experience of the 
word. Then the (2E)th experience 
will add about .73S; the (10E)th 
experience will add about .63S; the 
(50E)th experience will add about 
.37S; the (200E)th experience will 
add about .17S; the (500E)th experience 
will add about .09S; the ($E)th 
experience will add about 1.3S. 

² To save space, I omit all the tables of relative 
frequencies for the high school seniors, and 
the details of the construction of these tables. 
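
These estimates can be put in tabular form and read off for intermediate values. The sketch below is an illustration only: the tabulated points are those given in the text, with the (E)th experience taken as adding exactly S, but the interpolation (linear in the logarithm of the multiple of E) is an assumption of this sketch and not part of the estimates. The final estimate of 1.3S is left out because no usable multiple of E can be read for it. The function name is hypothetical.

```python
import math
from bisect import bisect_right

# (multiple of E, strength added as a fraction of S), from the text.
POINTS = [(1, 1.00), (2, 0.73), (10, 0.63), (50, 0.37), (200, 0.17), (500, 0.09)]


def relative_increment(multiple):
    """Strength added by the (multiple * E)th experience, in units of S.

    Assumption: values between the tabulated points are interpolated
    linearly in log(multiple).
    """
    if not POINTS[0][0] <= multiple <= POINTS[-1][0]:
        raise ValueError("outside the range covered by the estimates")
    xs = [m for m, _ in POINTS]
    i = bisect_right(xs, multiple) - 1
    if i == len(POINTS) - 1:
        return POINTS[-1][1]
    (x0, y0), (x1, y1) = POINTS[i], POINTS[i + 1]
    t = (math.log(multiple) - math.log(x0)) / (math.log(x1) - math.log(x0))
    return y0 + t * (y1 - y0)


for m in (1, 2, 20, 50, 500):
    print(m, round(relative_increment(m), 3))
```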


III. THE INFLUENCE OF 
THE MIND’S SET 


This section will report certain 
facts of general significance con- 
cerning the influence of the set (or 
adjustment, or task) of a person upon 
his evocation of words. 

I have noted in Section I that in- 
structions to summon to mind words 
of a certain sort beginning with cer- 
tain letters (e.g., Latin words begin- 
ning with ba, or names of persons and 
places beginning with be), do not pre- 
vent other words, particularly com- 
mon words, that begin with the 
letters in question from coming to 
mind. This is not to say that the 
mental sets caused by such instruc- 
tions were impotent. On the con- 
trary, in one way or another, they 
made demonstrable differences in 
the sorts of completions written. 

An easy case for mental set to 
dominate is the restriction to words of 
a language other than the person’s 
native tongue. My knowledge of 
Latin is meager, but when set to 
think of Latin words beginning in 


³ I conjecture that E will be between 10 and 
30 for the majority of college graduates. 
ce, ct, de, di, etc., I averaged 26 percent 
of Latin words (using in each case the 
first 10 written), though for the same 
36 sets with no restrictions or favor- 
itisms only 1.6 percent of the words 
written were Latin. 

A person whose native tongue is 
English but who knows a second 
language well could probably set his 
mind to evoke words from that second 
language so that eight or ten such 
would come to mind beginning with 
ba, or be, or bi, etc., with few intru- 
sions from English. The words of 
two languages known by a person 
usually remain in his mind in rather 
insulated systems. The mental set 
caused by the instructions may then 
operate by putting one or another 
system into action. 

A novel case is where the mind is 
set to get proper nouns and adjectives 
(names of persons, places, etc.). 
These are not separated from common 
nouns in a person’s experience except 
in so far as they have (1) been dis- 
tinguished by capital letters, (2) 
been oftener individual rather than 
class names, (3) been more frequent 
in histories and geographies than in 
science and fiction and more frequent 
on envelopes than in letters, and (4) 
been treated differently in various 
minor respects. They have not 
formed a separate organized system 
comparable to the words in a second 
language. Nevertheless, the mental 
set influenced the appearance of these 
words clearly and emphatically. I 
wrote completions to ba, bo, bu, and 
26 other sets of two letters each con- 
sisting of a consonant followed by a 
vowel, trying to summon names of 
persons, places, etc. A total of 269 
words came to mind, of which 84, or 
31.2 percent, were such. One month 
later I wrote completions to 16 of 
these 29 sets without any attempt to 
summon any particular sort of words, 




and wrote 176 words, of which only 
8.0 percent were names of persons, 
places, etc. I also wrote completions 
to 35 similar consonant-vowel sets 
(ni, no, nu, etc.) without any attempt 
to summon any particular sort of 
words, and wrote 444 words of which 
only 7.9 percent were names of 
persons, places, etc. In a long series 
of completions of three- (or rarely 
four-) letter sets (such as amb, ant, 
cli, clu) written without any attempt 
to summon any particular sort of 
words, I wrote 908 words, of which 
6.7 percent were names of persons, 
places, etc. The mental set thus 
quadrupled the percentage of proper 
names. 
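
The arithmetic behind this comparison can be checked briefly. The counts and percentages below are those reported in the text; the variable names are hypothetical.

```python
# With the mind set for proper names: 84 of 269 completions were such.
named, total = 84, 269
set_pct = 100 * named / total

# Percentages of proper names in the three unrestricted series.
baseline_pcts = [8.0, 7.9, 6.7]

# Ratio of the set percentage to each unrestricted percentage.
ratios = [set_pct / b for b in baseline_pcts]

print(round(set_pct, 1))
print([round(r, 2) for r in ratios])
```

Each ratio lies near four, which is the sense in which the set roughly quadrupled the percentage of proper names.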

A possibly more instructive case is 
where the mind is set to evoke rare 
words rather than common ones. 
Here there is no obvious distinguish- 
ing mark, though an experienced 
person might assume that long words, 
foreign words, and names of persons 
and places would be rare on the 
average, and might benefit by setting 
his mind to evoke long, foreign, or 
proper names. I was well aware of 
this, though I did not deliberately set 
my mind to evoke such, but only to 
evoke rare words. My mental set 
was clearly and emphatically in- 
fluential. Using 94 sets (ann, aut, 
bon, etc.) for which I had summoned 
words without any mental set other 
than to write whatever completions 
came to mind, I set myself the task 
three to five days later of trying to 
summon rare words. I got words 
with Thorndike-Lorge scores below 
36 1} times as frequently as in the 
‘regular’ experiments. This differ- 
ence came from 900 ‘regular’ com- 
pletions and 760 completions when 
trying to summon rare words. If the 
number of completions used for each 
set of letters is the same in both ex- 
periments, those in excess of that num- 
ber (703) being omitted, the corre- 
sponding ratio is 1.8. 

As a check I used 100 new sets of 
three letters (or rarely four) divided 
into two random halves, always 
stopping the recording after six com- 
pletions had come to mind (or before 
that in 23 cases, in which the flow of 
words stopped before six had ap- 
peared). In one half I set myself 
to summon rare words; in the other 
(done a day later), I set myself to 
summon common words. When set 
to summon rare words I got words 
below a T-L score of 18 1.6 times as 
frequently as when set to summon 
common words. 

Fifteen days later I used the same 
100 sets again, but reversed the sets 
of mind. This did not succeed in 
reversing the percentages of rare 
words. That for rare words when 


set to summon rare words was almost 
exactly the same as it had been with 


the other fifty sets of letters, but that 
for rare words when set to summon 
common words rose a great deal. As 
a result I got words below a T-L score 
of 18 only 1.1 times as frequently when 
set to summon rare words as when 
set to summon common words. 

The flexibility and speed of the 
mind’s adjustments may be tested by 
alternating attempts to summon 
proper names with unrestricted sum- 
monings, or attempts to summon rare 
with attempts to summon common 
words. Cards were prepared with p 
or reg, or r or c, marked on them in 
addition to the set of letters to be 
completed, and arranged so that the 
subject (myself) set himself for p and 
unrestricted in alternation, or for rare 
and common in alternation. Getting 
set was a matter of a couple of sec- 


⁴ Twelve of the 23 were in sets where I was set 
to summon rare words. 
onds or less, consisting merely of 
thinking ‘proper,’ ‘regular,’ ‘rare,’ or 
‘common.’⁵ 

In all these alternation experi- 
ments I stopped writing completions 
of a set of letters after six had ap- 
peared or before that if none appeared 
within about fifteen seconds after the 
last one written. As always I avoided 
searching by systematically trying a 
series of letters or sounds, and did not 
use any words clearly suggested by a 
previously appearing word. 

Set was potent even in this rapid 
alternation. Of 394 words written on 
p cards, 22.8 percent were proper 
names, whereas of 430 words written 
on reg. cards, only 10.7 percent were. 
The potency of the p set was however 
less in these alternating experiments 
than when it was maintained con- 
tinuously (31.2 percent as noted 
earlier). An interesting secondary 
phenomenon is the interference of the 
p set with the operation of the reg. 
set. The percentage of p words 
evoked in the reg. set here was 10.7, 
compared with 8.0, 7.9 and 6.7 per- 
cent in continuous experiments with 
unrestricted summoning of words. 

I made extensive experiments in 
alternating my mind’s set between 
‘rare’ and ‘common.’ In these any 
possible differences in the probability 
of rare versus common words being 
evoked by the letters themselves, re- 
gardless of the set of mind, were eli- 
minated by repeating the experiment 
40 days or more later with the cards 
that had been marked C now marked 
R and those that had been marked R 
now marked C. 

In spite of the rapid shifts, mental 
set was potent. With a series of 60 


⁵ In the case of a few words, I became aware 
of having slipped into the wrong set after having 
written several completions. I made no correc- 
tion for this. 
sets of letters (the sib, sim series) the 
results were as follows: When I was 
set to summon rare words, 18 percent 
of the words appearing had T-L scores 
of 540 or more occurrences per 
18,000,000 words; 26 percent had T-L 
scores of 36 to 539; 56 percent had T-L 
scores of 0 to 35. When I was set to 
summon common words, 28 percent 
of the words appearing had T-L scores 
of 540 or over, 28 percent had scores 
of 36 to 539; 44 percent had scores of 
0 to 35. 

With another series of 80 (the abs, 
aut series), the results were as follows: 
When I was set to summon rare words 
the percentages for the T-L scores 
noted above were 16, 30 and 54. 
When I was set to summon common 
words they were 21, 36 and 43. 
Averaging the results for the two 


experiments, we have: 











                        540 or over   36-539   0-35 
When set for rare           17          28      55 
When set for common         24½         32      43½ 














Being set for rare versus being set for 
common reduces the appearances of 
common words by 30 percent and in- 
creases the appearances of rare words 
by 26 percent. 
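
The averages and the two percentages follow directly from the figures reported for the two series, as this short check shows. The variable names are hypothetical; the numbers are those given in the text.

```python
# Percentages of completions with T-L scores of 540 or over, 36 to 539, 0 to 35.
sib_series = {"rare": (18, 26, 56), "common": (28, 28, 44)}
abs_series = {"rare": (16, 30, 54), "common": (21, 36, 43)}

# Simple average of the two experiments, class by class.
mean = {
    k: tuple((a + b) / 2 for a, b in zip(sib_series[k], abs_series[k]))
    for k in sib_series
}

# Change in common words (first class) and rare words (last class)
# when set for rare, relative to being set for common.
common_drop = 100 * (1 - mean["rare"][0] / mean["common"][0])
rare_gain = 100 * (mean["rare"][2] / mean["common"][2] - 1)

print(mean)
print(round(common_drop, 1), round(rare_gain, 1))
```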

Much earlier the abs, aut series of 
80 had been used with the mind set 
for ‘rare’ and even earlier with the 
mind set merely to evoke words. As 
would be expected, the mind sets for 
rare and for common are less potent 
when alternating than when main- 
tained alone. The facts are as follows: 








                                                 T-L Frequencies per 18,000,000 
                                                    180 or Over    36-179 
Set for rare maintained throughout                      18%          18% 
Set for rare in alternation with set for common         16%          30% 
Difference (alternation minus rare)                     -2           12 
Set unrestricted (i.e., to write any word) 
  maintained throughout                                 34%          23% 
Set for common in alternation with set for rare         21%          36% 
Difference (alternation minus unrestricted)            -13           13 

CHANGES IN THE ATTRACTIVENESS OF ACTIVITIES: 
THE EFFECT OF EXPECTATION 
PRECEDING PERFORMANCE 


BY MILDRED E. GEBHARD 


University of Pennsylvania 


The results of a recent experiment 
led to the conclusion that the changes 
in the attractiveness of activities 
which accompanied success and failure 
were a function not only of the experi- 
ence of success or failure, but also of 
the expectation of future success or 
failure, and of the strength of the need 
to be successful or to avoid failure in 
the experimental situation. In gen- 
eral, the experience of success was 
accompanied by an increase in the 
attractiveness of the performed activ- 
ity while the experience of failure led 
to a decrease of its attractiveness. 
The expectation of future success was 
accompanied by an increase in the 
attractiveness of the performed activ- 
ity, while the expectation of future 
failure led to a decrease in its at- 
tractiveness. These changes were 
significant when the need to be suc- 
cessful or to avoid failure was rela- 
tively weak, but were not significant 
when a strong need existed. 

The present experiment attempts to 
explore still further the effect of suc- 
cess and failure upon the attractive- 
ness of activities. In the previous 
experiment, changes in the attractive- 
ness of an activity were shown to be a 
function of expectation of future suc- 
cess or failure in the performance of 
that activity. The question is raised: 
are changes in attractiveness, which 
accompany the performance of an 
activity, also a function of the expect- 
ations existing prior to the experience 
of success or failure? 

In the present experiment as in the 
previous one, each S ranked a num- 


ber of tasks in order of attractiveness 
and performed the task given the 
median rank. Again, as before, ex- 
pectation and experience of success or 
failure were defined operationally, but 
the experimental procedure was care- 
fully planned to increase the likeli- 
hood that genuine anticipations and 
feelings of success or failure would 
occur. In the present experiment, in 
contrast to the previous one, expecta- 
tion was experimentally varied before 
rather than after performance of the 
task. Its variation was effected, as 
before, by presentation of pseudo-evidence 
as to the atypical easiness or 
difficulty of the task. Variation in 
experience, as before, was effected 
primarily through the use of pseudo- 
norms. In the present experiment, 
data were obtained only under the 
condition of little need for success or 
for avoidance of failure. 

Predictions of changes in the attractiveness 
of the performed activity 


were as follows: (1) that attractive- 
ness would rise when experienced 
success followed expected failure and 
would fall when experienced failure 
followed expected success; (2) that 
changes in attractiveness would be 
less consistent when expectation and 
experience were in agreement, both of 
success or both of failure, but would 
rise somewhat with expected and 
experienced success and probably¹ 
would fall somewhat or remain un- 
changed with expected and experi- 

¹ The latter half of this prediction was offered 
with some hesitation. See discussion which 
follows. 
enced failure; (3) following from the 
above predictions, that changes would 
be greater² when expectation and 
experience were in opposite directions, 
one of success and one of failure, than 
when they were in the same direction, 
both of success or both of failure. 


These predictions are based upon the 
following principles. (1) Anticipations 
(expectation) of success or failure in the 
performance of a task (a) may result in 
relatively higher or lower levels of aspi- 
ration respectively, and (b) may estab- 
lish, in advance, a factor which can be 
considered responsible for achievement. 
(2) Feelings (experience) of success or 
failure do not depend upon the absolute 
level of achievement, but may be a 
function of (a) levels of aspiration pre- 
ceding achievement and (b) the factor 
considered responsible for achievement. 
(3) Changes in the attractiveness of an 
activity are a function of feelings of suc- 
cess or failure in the performance of the 
activity. These principles are derived, 


in large part, from earlier experimental 


studies and observations. 

1. With regard to the first principle it 
must be recalled that expectation was 
varied by presenting pseudo-evidence 
that the series of problems performed by 
the Ss were, by chance, unusually easy 
or difficult. (1a) If a task is perceived 
or believed to be difficult it has been 
shown (2) that levels of aspiration are 
generally lower than if the task is per- 
ceived or believed to be easy. (1b) 
Since this difficulty or easiness appears 
to be a function of chance variation in 
difficulty, a factor which can be con- 
sidered responsible for ensuing success 
or failure is immediately suggested. 
Achievement is a function of personal 
ability and effort, but it may also be a 
function of uncontrollable situational 
factors. Failure on an unusually diffi- 
cult task or success on an unusually 
easy task may, with justification, be 
considered the function of uncontrolled 
variation in the difficulty of the task. 

² I.e., a greater difference between average 
rankings of attractiveness under the two con- 
ditions. 
In contrast, success on an unusually 
difficult task, or failure on an easy one 
surely suggest considerable personal abil- 
ity and effort, or lack of such. 

2. The second principle is similarly 
derived. Lewin et al. conclude that 
(2a) “The experiments show that the 
feeling of success and failure does not 
depend on an absolute level of achieve- 
ment. . . . What counts is the level of 
achievement relative to certain stand- 
ards, in particular to the level of aspi- 
ration (goal line): if achievement lies on 
or above the goal line, the subject will 
probably have a feeling of success; if it 
lies below the goal line he will probably 
feel failure, depending on the size of this 
difference and the ease with which the 
achievement has been reached” (2, pp. 
374-375). Similarly, (2b) feelings of 
success and failure may be a function of 
the relation between absolute achieve- 
ment and the factor held responsible for 
such achievement. “Only if the result 
of the action is ‘attributed’ to the person 
as actor and not attributed to other 
persons or to ‘nature’ can we speak psy- 
chologically of an ‘achievement’ of this 
person” (2, p. 375). It should be noted 
here that this factor assumes more im- 
portance under conditions of failure. 
“There is a tendency after failure to 
link the poor result to a faulty instru- 
ment, to sickness, or to any event 
‘outside the power’ of the individual in- 
volved. The fact that such severing of 
the link between the result and the in- 
dividual is more frequent after poor than 
after good achievement shows that it 
can be due to the force of avoiding 
failure” (2, p. 375). 

3. Support for the third principle comes 
from the previous experiment, in which 
experimentally varied experience re- 
sulted in increased attractiveness of the 
performed task under the condition of 
success and decreased attractiveness of 
the performed task under the condition 
of failure. Rated feelings of success and 
failure were in agreement with attempted 
experimental variation. 

If the assumption is made that these 
principles are basic and operative in this 
study, a further analysis of the present 
experimental situation is possible. A 
schematic presentation of this analysis 
appears in Table I. The four combina- 
tions of experimentally varied expecta- 
tion and experience appear in Columns 
1 and 2. The variable of expectation 
of success or failure was controlled by 
introducing pseudo-evidence that chance 
variation had resulted in the S receiving 
an unusually easy or difficult task 
(Column 3). It is assumed that antic- 
ipations of success or failure were in line 
with experimental variations and led to 
high and relatively lower aspirations re- 
spectively (Column 4). The variable 
of experience of success or failure was 
controlled by introducing pseudo-evi- 
dence of two degrees of achievement 
(Column 5). The factors probably ac- 
cepted by S as responsible for this 
achievement (Column 6) are a function 
of the chance difficulty of the task 
(Column 3) and the level of achievement 
(Column 5). Thus, high achievement 
on an easy task is, at least in part, a 
function of its easiness, whereas high 
achievement on a difficult task reflects 
merit solely on one’s personal ability and 
effort. Likewise, low achievement on 
an easy task suggests lack of personal 
ability or effort, whereas low achieve- 
ment on a difficult task is, at least in 
part, a function of its difficulty. Actual 
feelings of success or failure (Column 7) 
are very likely a function of the level of 
aspiration (Column 4) in conjunction 
with the level of achievement (Column 
5) and the factor seemingly responsible 
for this achievement (Column 6). Thus, 
a high level of achievement is likely to 
yield feelings of success, but these are 
likely to be stronger if the discrepancy 
between aspiration and achievement is 
large and if the success appears to be a 
function of personal ability rather than 
of chance variation in the difficulty of 
the task. A low level of achievement is 
likely to yield feelings of failure, but 
these are likely to be stronger if the dis- 
crepancy between aspiration and achieve- 


TABLE I 

Schematic Analysis of Predicted Effect of Experimental Variables of Expectation 
and Experience upon Attractiveness of Performed Activity in 
Present and Previous Experiments 

Expect. = Expectation; Exper. = Experience. 

Present Experiment 

Experimental Condition   Nature of   Probable    Level of   Factor in   Actual     Change in 
(1)        (2)           Task        Level of    Achieve-   Achieve-    Feelings   Attrac- 
Expect.    Exper.        (Chance)    Aspiration  ment       ment        (Exper.)   tiveness 
                         (3)         (4)         (5)        (6)         (7)        (8) 

Success    Success       Easy        High        80%ile     Situation   Success    Rise 
Failure    Success       Difficult   Lower       80%ile     Person      Success    Rise 
Success    Failure       Easy        High        20%ile     Person      Failure    Fall 
Failure    Failure       Difficult   Lower       20%ile     Situation   Failure?   Fall? 

Previous Experiment 

Column headings: Experimental Condition (Exper., Expect.); Probable Level of 
Aspiration; Level of Achievement; Nature of Task (Chance); Actual Feelings 
(Exper.); Factor in Achievement; Change in Attractiveness; Average Change 
in Attract.* 

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CHANGES IN ATTRACTIVENESS OF ACTIVITIES 


ment is large and if the failure appears to 
be a function of lack of ability, rather 
than of chance variation in the difficulty 
of the task. The predicted changes in 
the attractiveness of the activity (Col- 
umn 8) fall directly in line with the feel- 
ings of success or failure, greatest in- 
creases in attractiveness with strongest 
feelings of success, greatest decreases in 
attractiveness with strongest feelings of 
failure. 
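The attributional part of this scheme can be written out as a small sketch. The function names and the string labels below are hypothetical and serve only to restate Columns 3, 5, 6 and 8 for the present experiment; the sketch covers the direction of the predicted change, not the predicted intensity of feeling.

```python
# Hypothetical encoding of the Table I scheme (present experiment).
# "easy"/"difficult" is the pseudo-chance nature of the task (Column 3);
# "high"/"low" is the reported level of achievement (Column 5).


def factor_in_achievement(task: str, achievement: str) -> str:
    """Column 6: factor S probably holds responsible for the result."""
    # A result that merely matches the task's difficulty can be credited to
    # (or blamed on) the situation; a mismatch points to the person.
    matches_difficulty = (task, achievement) in {
        ("easy", "high"),
        ("difficult", "low"),
    }
    return "situation" if matches_difficulty else "person"


def predicted_change(task: str, achievement: str) -> str:
    """Column 8: predicted direction of change in attractiveness."""
    direction = "rise" if achievement == "high" else "fall"
    # Only failure that can be attributed to the situation is marked doubtful.
    if achievement == "low" and factor_in_achievement(task, achievement) == "situation":
        direction += "?"
    return direction


for task in ("easy", "difficult"):
    for achievement in ("high", "low"):
        print(task, achievement,
              factor_in_achievement(task, achievement),
              predicted_change(task, achievement))
```

Run as written, the loop prints one line per experimental condition, which can be compared row by row with the present-experiment half of Table I.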

Actually no information was obtained 
during the experiment as to the Ss’ 
levels of aspiration or as to their rec- 
ognition and acceptance of the factors 
which could be held responsible for 
performance. These intervening vari- 
ables have been assumed to be operating 
and to be operating in the direction in- 
dicated by relevant studies. However, 
some support for these assumptions can 
be found in a similar analysis of the 
previous experiment in comparison with 
the results obtained in that study. 

In the previous experiment, expectation 
of success or failure was experimentally 
varied after the task had been 
performed and the level of achievement 
reported to the S. As there was no 
systematic variation of expectation 
before performance, it can be assumed 
that the aspirations of Ss in the various 
experimental conditions were compar- 
able and probably relatively high (Col- 
umn 3). Experience was experimentally 
controlled by variation in the reported 
level of achievement (Column 4). Fu- 
ture expectation was then varied by 
introducing pseudo-evidence as to chance 
variation in the difficulty of the task 
(Column 5). Independent of level of 
achievement, if the performed task was 
atypically difficult, future levels of 
achievement should be higher; if the 
performed task was atypically easy, 
future levels of achievement may be 
lower. The factors probably accepted 
as responsible for achievement (Column 
6) are a function of level of achievement 
(Column 4) and the chance difficulty of 
the task (Column 5). Actual feelings 
of success and failure (Column 7) were 
very likely a joint function of the level of 
aspiration (Column 3), the level of 
achievement (Column 4) and the factor 
seemingly responsible for achievement 
(Column 6). The predicted changes in 
the attractiveness of the activity follow 
in line with the predicted feelings of 
success or failure. The average in- 
creases or decreases in the attractiveness 
of the performed activity are recorded in 
the last column. It is to be noted that 
the results follow the predictions with the 
exception of the condition of experienced 
failure—expected success, where there is 
a slight rise rather than a fall in attrac- 
tiveness. As this is the condition where 
failure can be attributed to situational or 
chance factors, it must be questioned 
whether failure was actually experienced 
under this condition. Relief from re- 
sponsibility for failure in the perform- 
ance of a task appears to be accompanied 
by increased attractiveness of that task. 
In consideration of these results the 
prediction for the corresponding situa- 
tion in the present experiment was 
offered with some hesitation, as the 
question marks indicate. 


EXPERIMENTAL DESIGN 
AND PROCEDURE³ 


Nine tasks⁴ were presented to each S. After 
the tasks had been described, each S was told 
that the experiment would require performance 
of some of these tasks, and she was asked to 
rank them in the order in which she ‘liked to do’ 
them. The E determined the first task to be 
performed by randomly drawing one of nine 
folded slips in a box. The selected task, con- 
trary to appearance, was not randomly chosen 
but was always the one to which the S had given 
the median rank of five. 

In an attempt to establish a situation in 
which the need for success or for avoidance of 
failure was relatively weak, Ss were told that 
they were participating in a preliminary experiment; 
that the E was not concerned with the 
quality of the S’s own performance; that the E 
sought information as to the nature of the tasks 
and the S’s reaction to them. Ss were instructed 
to observe their own reactions to the tasks during 
performance, as preparation for answering questions 
to be asked later. 

³ The past and present experiments are very 
similar in design and procedure. Because of this 
similarity, the rather lengthy instructions have 
not been reprinted. The few changes made in 
the original instructions are listed in the appendix 
to this paper. 

⁴ The nine tasks were the ones used in the 
previous experiment: arithmetic problems, codes, 
cube counting, sentence completion, paper formboard, 
number checking, proverbs, symbol 
transcription, algebraic reasoning. 

Every task consisted of 30 problems, each of 
which was printed on a separate 3 × 5 in. card. 
E shuffled the pack and, stating the need of a 
stop watch, stepped to shelves behind the S’s 
back, picked up a stop watch, and, unobserved, 
slipped a standard set of 10 cards of the same 
type of problem on the top of the deck. Return- 
ing, E counted off the top 10 cards, casually 
turned them over, and examined the symbols 
printed on their backs. A square had been 
printed on seven of the cards and an asterisk on 
the remaining three. Expectations of success or 
failure were experimentally varied by E’s comments 
and by an interpretation of these symbols 
as representative of two degrees of difficulty. 
The symbols were so interpreted that half the 
Ss had reason to believe that through the opera- 
tion of chance they had received an unusually 
difficult set of problems; the other half, that they 
had received an unusually easy set. 

All tasks consisted of ten sub-tasks or prob- 
lems, and each S was interrupted after seven of 
these problems were completed. As E searched 
for a mislaid questionnaire, a set of norms was 
‘accidentally’ discovered, announced, and, as if 
upon sudden inspiration, checked against S’s 
performance. Experiences of success and failure 
were experimentally varied by E’s comments and 
by showing and interpreting the norms to the S. 
Two sets of norms were necessary. In the fail- 
ure norms, seven completed problems placed the 
performance at the 20th percentile for college 
students. In the success norms, seven completed 
problems placed the performance at the 
80th percentile. E expressed doubt that time 
should have been taken to check with the norms, 
but justified it on grounds that information re- 
garding one’s performance is always of interest. 

The digression concluded, E next pointed out 
that when we ‘actually do’ tasks we may change 
our opinion as to how much we like to do them. 
S was then asked to make a second ranking of 
the tasks but told not to be disturbed either by 
consistency or lack of consistency of this ranking 
with her previous ranking. Two questionnaires 
were then answered by each S. The first questionnaire 
requested ratings of the interest value 
of the performed task. The second question- 
naire requested ratings of S’s experience, future 
expectation, and satisfaction with performance 
on the experimental task. 

The two experimental variables were expectation 
prior to performance and experience accompanying 
performance. With two degrees of 
variation in each, four combinations of the two 
variables were possible. As in the previous 
study, the experiment was planned according to 
the principles of factorial design. Ten Ss were 
randomly assigned to each of the four different 
conditions, a total of 40 Ss. All Ss were young 
women in the elementary psychology course at 
the University of Pennsylvania. The experiment, 
performed in the middle of the fall term, 
preceded any course content which might have 
prejudiced experimental results by increasing the 
sophistication of the Ss. Student volunteers 
were solicited in three laboratory classes. At 
the time of the appeal the students were informed 
that the experiment would take about one-half 
hour of their time. They were told that inter- 
communication between Ss would vitiate the 
experimental results, and were asked to volunteer 
only if they felt they could cooperate in this 
respect. They were shown hand-painted plaques 
which would be given them in appreciation of 
their participation in the experiment. 

The experimental results consist of the meas- 
ures of attractiveness, i.e., the preferential rank- 
ings of the tasks made before and after the task 
was performed. The questionnaires provide 
additional data. 
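The assignment of Ss to the four cells of the design can be illustrated with a short sketch. The source states only that ten Ss were randomly assigned to each condition; the shuffling method, the function name, and the seed below are assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import product

EXPECTATIONS = ("expect success", "expect failure")
EXPERIENCES = ("experience success", "experience failure")


def assign_conditions(subjects, seed=None):
    """Randomly assign subjects to the 2 x 2 cells with equal cell sizes."""
    cells = list(product(EXPECTATIONS, EXPERIENCES))
    if len(subjects) % len(cells) != 0:
        raise ValueError("number of subjects must be a multiple of the number of cells")
    per_cell = len(subjects) // len(cells)
    # One slot per subject: each cell repeated per_cell times, then shuffled.
    slots = [cell for cell in cells for _ in range(per_cell)]
    random.Random(seed).shuffle(slots)
    return dict(zip(subjects, slots))


# Hypothetical subject numbers 1..40; the seed only makes the run repeatable.
assignment = assign_conditions(list(range(1, 41)), seed=0)
cell_sizes = Counter(assignment.values())
print(cell_sizes)
```

Shuffling a fixed list of slots, rather than drawing a condition independently for each S, guarantees the equal cell sizes that the factorial analysis assumes.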


RESULTS AND DISCUSSION 


I. Analysis of Attractiveness Rankings 
of the Critical Task 


The nine tasks were ranked in order 
of attractiveness by each S. The 
first rankings, made at the outset of 
the experiment, determined the task 
to be performed by each S. This 
task, which in each case was the task 
given the mid-rank of five, is hereafter 
referred to as the critical task. 
Changes in the attractiveness of the 
critical task were determined by ob- 
serving its position in the second 
rankings, a value smaller than five 
indicating an increase in attractive- 
ness, a value larger than five, a de- 
crease in attractiveness. This sec- 
ond ranking was made immediately 
after the performance of the critical 
task, and is a measure of the effect of 
the experimental variables upon the 
attractiveness of the task. Table II 
gives the means of the second rankings 
of the critical task for the 10 Ss 
in each of the four experimental conditions. 


TABLE II 

MEAN RANK OF ATTRACTIVENESS OF THE 
CRITICAL TASK ON NINE-POINT SCALE 
AFTER PERFORMANCE UNDER EACH 
EXPERIMENTAL CONDITION 

                   Experience Success      Experience Failure
                   Expect     Expect       Expect     Expect
                   Success    Failure      Success    Failure

Second Rankings    3.6        3.3          6.4        4.3

It is to be observed that there is an 
overall tendency for the second rank- 
ings to indicate increased attractive- 
ness of the critical task, the mean of 
all second rankings being 4.4 as com- 
pared with the original rankings of 5. 
However, this difference between the 
rankings fails to meet an adequate 
criterion of statistical reliability as 
a t-test yields a P of >.05. 

The mean ranks under the two 
conditions of expectation show that the 
attractiveness of the critical task increased 
under the condition of expected 
failure and both increased and 
decreased under the condition of 
expected success (cf. 3.3 and 4.3 with 
3.6 and 6.4). The mean ranks under 
the two conditions of experience show 
that the attractiveness of the critical 
task increased under the condition of 
experienced success, and decreased or 
increased relatively less under the 
condition of experienced failure (cf. 
3.6 and 3.3 with 6.4 and 4.3). An 
analysis of variance was made to 
determine the statistical reliability of 
these differences. In terms of experi- 
mental variables, the analysis in- 
dicated that the critical task was sig- 
nificantly more attractive if failure 
had been expected than if success had 
been expected (P <.05). It also 
indicated that the critical task was 
significantly more attractive if success 
had been experienced than if 
failure had been experienced (P < .01). 
No significant interaction between 
expectation and experience appeared. 
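The two main effects can be read directly from the cell means of Table II. The following sketch is illustrative only: it reproduces the marginal means on which the comparisons rest and cannot reproduce the analysis of variance itself, since the individual rankings are not reported. The dictionary layout and function name are assumptions.

```python
# Cell means from Table II, keyed by (expectation, experience).
# Smaller ranks mean greater attractiveness.
CELL_MEANS = {
    ("success", "success"): 3.6,
    ("failure", "success"): 3.3,
    ("success", "failure"): 6.4,
    ("failure", "failure"): 4.3,
}


def marginal_mean(factor: int, level: str) -> float:
    """Mean rank over the two cells sharing one level of a factor.

    factor 0 is expectation, factor 1 is experience.
    """
    values = [m for key, m in CELL_MEANS.items() if key[factor] == level]
    return round(sum(values) / len(values), 2)


grand_mean = round(sum(CELL_MEANS.values()) / len(CELL_MEANS), 2)
print("grand mean:", grand_mean)
for factor, name in enumerate(("expected", "experienced")):
    for level in ("success", "failure"):
        print(name, level, marginal_mean(factor, level))
```

With equal cell sizes the marginal means are simple averages of the cell means: 3.8 for expected failure against 5.0 for expected success, and 3.45 for experienced success against 5.35 for experienced failure, with a grand mean of 4.4.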

Attractiveness rose when experi- 
enced success followed expected fail- 
ure and fell when experienced failure 
followed expected success (cf. 3.3 and 
6.4 or, expressed in mean changes from 
the original ranking of 5, cf. +1.7 
and −1.4); this confirms Prediction 
1. Attractiveness rose with expected 
and experienced success and also rose 
somewhat with expected and experi- 
enced failure (cf. 3.6 and 4.3 or, ex- 
pressed as mean changes from the 
original rankings of 5, cf. +1.4 and 
+0.7); this partially confirms Predic- 
tion 2. Under the two conditions in 
which expectation and experience are 
in contrast, expectation of failure fol- 
lowed by the experience of success, 
and expectation of success followed 
by the experience of failure, the mean 
changes in attractiveness are greater 
in consistent direction than when ex- 
pectation and experience are in agree- 
ment, both of success or both of 
failure (cf. 3.3 and 6.4 with 3.6 and 
4.3 or, expressed as mean changes 
from the original ranking of 5, cf. +1.7 
and −1.4 with +1.4 and +0.7). 
It is apparent that the two conditions 
of contrasted expectation and experi- 
ence result in mean changes of at- 
tractiveness which are strikingly dif- 
ferent. A difference of 3.1 ranks 
separates the two means, which, when 
submitted to a t-test, is sufficiently 
large to yield a P-value of <.01. 
The conditions of consistent expecta- 
tion and experience resulted in mean 
changes only 0.7 ranks apart. Thus, 
Prediction 3, also, is confirmed. 
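The arithmetic behind these comparisons can be checked with a brief sketch. The values are the cell means of Table II; the variable and function names are hypothetical.

```python
ORIGINAL_RANK = 5  # every critical task started at the median rank

# Mean second rankings from Table II, keyed by (expectation, experience).
SECOND_RANKINGS = {
    ("success", "success"): 3.6,
    ("failure", "success"): 3.3,
    ("success", "failure"): 6.4,
    ("failure", "failure"): 4.3,
}

# Positive change = rise in attractiveness (the rank moved toward 1).
changes = {
    condition: round(ORIGINAL_RANK - rank, 1)
    for condition, rank in SECOND_RANKINGS.items()
}


def gap(first, second):
    """Absolute difference between the mean changes of two conditions."""
    return round(abs(changes[first] - changes[second]), 1)


# Expectation and experience in contrast versus in agreement.
contrast_gap = gap(("failure", "success"), ("success", "failure"))
agreement_gap = gap(("success", "success"), ("failure", "failure"))

print(changes)
print("contrast:", contrast_gap, "agreement:", agreement_gap)
```

The mean changes come to +1.4, +1.7, −1.4 and +0.7, and the two gaps to 3.1 and 0.7 ranks, as stated in the text.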

In reference to Table I it is to be 
observed that the two experimental 
conditions in which expectation contrasted 
with experience are the two 
conditions in which it was predicted 
that actual feelings of success and 
failure would be strongest and would 
result in most marked changes in attractiveness. 
The two conditions in 
which expectation and experience are 
in agreement are those in which it was 
predicted that actual feelings of suc- 
cess and failure would be less intense 
and would result in less marked 
changes in attractiveness. The aver- 
age changes in attractiveness in the 
four conditions are in the rank order 
assigned them on the basis of pre- 
dicted actual experience. (Cf. Rise, 
Rise, Fall, Fall? [Table I] with 1.4, 
1.7, −1.4, 0.7 [Table II].) It is 
interesting to note that the condition 
of expected and experienced failure, 
where failure could be interpreted 
legitimately as the outcome of adverse 
situational factors, has actually re- 
sulted in a slight average increase of 
attractiveness. This is consistent 
with the findings of the previous 
experiment. A situation in which 
failure could be rejected freely has 
eventuated in the task being ranked as 
somewhat more attractive than orig- 
inally. Of equal interest is the fact 
that success, which could be inter- 
preted as the outcome of favorable 
situational factors, has resulted in 
average increases of attractiveness 
which are only slightly less than those 
accompanying that success which 
appears to be a function of exceptional 
personal ability. Lewin et al. (2) 
have noted that the severing of the 
link between the result and the indi- 
vidual is more frequent after poor 
than after good achievement. 
Changes in the attractiveness of the 
performed activity suggest that such 
differential acceptance or rejection 
of achievement has, in fact, occurred. 


II. Analysis of General and Personal 
Interest Ratings of Critical Task 


Two questionnaires which were 
answered by all Ss required ratings of 
the interest value of the task, and of 
feelings of success or failure, future 
expectations of success or failure, and 
satisfaction with performance. The 
questionnaires were employed in both 
the previous and the present experi- 
ments to support the verbal instruc- 
tions used to establish the condition 
of little need, and to serve as a partial 
check on the effective induction of 
the experimental variables. 

In the first questionnaire Ss were 
required to rate the critical task as to 
its interest for people in general and 
for themselves personally. On the 
seven-point scale, a value smaller than 
four indicated a judgment of inter- 
esting, a value larger than four, a 
judgment of uninteresting. In the 
upper part of Table III are given the 
mean ratings of general and personal 
interest for each of the four experimental 
conditions. The results are 
comparable to those obtained in the 
previous experiment under the condition 
of little need. (1) There is a 
consistent tendency to rate the task 
as interesting, no single average of 
general or personal interest being as 
large as, or larger than, 4. (2) The 
ratings of general interest are not 
differentiated by the experimental 
variables of experience and expectation 
(cf. 2.5, 2.6, 2.4, 2.8). (3) In 
contrast, the ratings of personal 
interest appear to be a function of 
experimentally varied experience, the 
task being rated as having greater 
personal interest with experienced 
success than with experienced failure 
(cf. 2.7 and 2.5 with 3.7 and 3.3).⁵ 
The analysis reported in Table IV 
confirms the statistical reliability of 
the effect of experience on ratings of 
personal interest. All other values 
are without significance. 


TABLE III 

MEAN RATINGS ON SEVEN-POINT SCALE OF 
GENERAL AND PERSONAL INTEREST IN CRITICAL 
TASK, AND OF EXPERIENCE, FUTURE EXPECTATION 
AND SATISFACTION AFTER PERFORMANCE 
UNDER EACH EXPERIMENTAL CONDITION 

                     Experience Success      Experience Failure
                     Expect     Expect       Expect     Expect
                     Success    Failure      Success    Failure

General Interest     2.5        2.6          2.4        2.8
Personal Interest    2.7        2.5          3.7        3.3

Experience           3.7        2.9          5.5        4.5
Expectation          2.8        2.6          4.2        3.5
Satisfaction         3.2        2.1          5.2        3.9


TABLE IV 

PROBABILITIES FROM ANALYSIS OF VARIANCE: INTEREST AND VARIABLE INDUCTION RATINGS 
AFTER PERFORMANCE OF CRITICAL TASK 

                     Ratings of              Ratings of
                     General    Personal
                     Interest   Interest     Experience   Expectation   Satisfaction

Experience           >.05       <.05         <.01         <.01          <.01
Expectation          >.05       >.05         <.01         >.05          <.01
Exper. × Expect.     >.05       >.05         >.05         >.05          >.05


III. Analysis of Ratings of 
Experimental Variables 


The second questionnaire required 
Ss to rate their feelings of success or 
failure, future expectations of success 
or failure and satisfaction with their 
performance of the critical task. 
Again on a seven-point scale, a value 
smaller than four represents experi- 
ence or expectation of success, and 
satisfaction with performance; a value 
larger than four represents experience 
or expectation of failure, or dissatis- 
faction with performance. 

⁵ There is some indication that ratings of 
personal interest are more favorable with expected 
failure than with expected success. 


The mean ratings of experience, 
expectation and satisfaction, made by 
Ss in each of the four experimental 
conditions, are reported in the lower 
part of Table III. The principal ob- 
servations are these: (1) Ss under 
the condition of expected success con- 
sistently make less favorable ratings 
of experience, future expectation, and 
satisfaction than those under the 
condition of expected failure (cf. 3.7, 
5.5 with 2.9 and 4.5; 2.8 and 4.2 with 
2.6 and 3.5; 3.2 and 5.2 with 2.1 and 
3.9, respectively). (2) Ss under the 
condition of experienced success con- 
sistently make more favorable ratings 
of experience, future expectation and 
satisfaction than those under the 
condition of experienced failure (cf. 
3.7 and 2.9 with 5.5 and 4.5; 2.8 and 
2.6 with 4.2 and 3.5; 3.2 and 2.1 with 
5.2 and 3.9). From the P-values 
reported in Table IV it can be seen 
that all but one of these differences are 
statistically reliable (P <.01). The 
single exception is the ratings of future 
expectation as affected by the variable 
of prior expectation; this value closely 
approaches but does not reach the .05 
level of confidence. (3) In Table I 
the four experimental conditions were 
analyzed as to the feelings of success or 
failure they would induce. Four 
different degrees of feeling, varying in 
direction and intensity, were predicted. 
If the four degrees are considered 
in rank order from strongest 
feeling of success to strongest feeling 
of failure, it is to be noted that this 
same rank order not only appears in 
the ratings of experience but also in 
the ratings of future expectations of 
success or failure, the ratings of 
satisfaction with performance, and 
the ratings of personal interest of the 
performed task. The most favorable 
ratings of experience, future expecta- 
tion, satisfaction, and personal inter- 
est occur under the condition of ex- 
pected failure and experienced suc- 
cess; in second rank, in each case, is 
expected and experienced success; in 
third place, expected and experienced 
failure; and, in fourth place, expected 
success and experienced failure. 

Although these ratings are not com- 
pletely conclusive evidence that the 
experimental procedure did produce 
the predicted feelings of success and 
failure, they do suggest that the four 
degrees of feeling did exist as hy- 
pothesized. 
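The agreement in rank order can be verified from the means quoted in the text. The sketch below is illustrative; it uses the mean second rankings of Table II and the mean ratings of experience, future expectation and satisfaction cited above, and the names and data layout are assumptions. On every measure a smaller value is the more favorable one.

```python
# Predicted order, from strongest feeling of success to strongest feeling
# of failure, keyed by (expectation, experience).
PREDICTED_ORDER = [
    ("failure", "success"),
    ("success", "success"),
    ("failure", "failure"),
    ("success", "failure"),
]

# Condition means as quoted in the text; smaller values are more favorable.
MEASURES = {
    "second ranking": {
        ("failure", "success"): 3.3, ("success", "success"): 3.6,
        ("failure", "failure"): 4.3, ("success", "failure"): 6.4,
    },
    "experience": {
        ("failure", "success"): 2.9, ("success", "success"): 3.7,
        ("failure", "failure"): 4.5, ("success", "failure"): 5.5,
    },
    "future expectation": {
        ("failure", "success"): 2.6, ("success", "success"): 2.8,
        ("failure", "failure"): 3.5, ("success", "failure"): 4.2,
    },
    "satisfaction": {
        ("failure", "success"): 2.1, ("success", "success"): 3.2,
        ("failure", "failure"): 3.9, ("success", "failure"): 5.2,
    },
}


def observed_order(means):
    """Conditions sorted from most to least favorable mean."""
    return sorted(means, key=means.get)


matches = {name: observed_order(means) == PREDICTED_ORDER
           for name, means in MEASURES.items()}
print(matches)
```

Each measure yields the same ordering of the four conditions, which is the sense in which the ratings support the hypothesized degrees of feeling.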


SUMMARY AND CONCLUSIONS 


This study was designed to deter- 
mine the effect of pre-performance 
expectation of success and failure 
upon the attractiveness of activities. 
Women college students were asked to 
rank a series of tasks in order of 
attractiveness and each performed the 
task to which she had given the 
median rank of attractiveness. Pre- 
performance expectations of success 
or failure and experiences of success 
or failure were varied experimentally. 
Any consequent changes in attractive- 
ness of the performed task were deter- 
mined through a second ranking of the 
tasks. Ss made ratings of the interest 
value of the performed task and of 
their feelings of success or failure, 
future expectations of success or 
failure, and satisfaction with results. 

The major conclusions are sum- 
marized below. 


MILDRED E. GEBHARD 


1. Attractiveness rose when experi- 
enced success followed expected fail- 
ure and fell when experienced failure 
followed expected success. 

2. Attractiveness rose when experi- 
enced success followed expected suc- 
cess and also rose somewhat when 
experienced failure followed expected 
failure. 

3. Average changes in attractive- 
ness were greater when expectation 
and experience were in contrast, one 
of success and the other of failure, 
than when expectation and experience 
were similar, both of success or both of 
failure. 

4. Ratings of personal interest in 
the task and ratings of feelings of 
success or failure, future expectations 
of success or failure, and satisfaction 
with results, were more favorable with 
pre-performance expectations of fail- 
ure and with experience of success than 
with pre-performance expectations of 
success and with experience of failure. 

5. The four predicted degrees of 
feelings of success and failure, with 
corresponding changes in the at- 
tractiveness of the performed activity, 
were supported by the second rankings 
of attractiveness, the ratings of per- 
sonal interest in the task, and the 
ratings of feelings of success or failure, 
expectation of success or failure, and 
satisfaction with results. 


APPENDIX 


The procedure reported in the previous ex- 
periment was subdivided as follows: introduc- 
tion, description of tasks, first ranking of tasks, 
establishment of need, beginning of task, ending 
of task, induction of experience and expectation 
of success and failure, transition to second rank- 
ing, second ranking of tasks, questionnaires, con- 
clusion. The previous and present procedures 
are identical with these exceptions: (1) The in- 
structions accompanying the condition of strong 
need were eliminated from (a) the “establish- 
ment of need” and (b) the “ending of task.” 
The instructions accompanying the little need 
condition were retained. (2) As expectations 
were varied experimentally before performance 
instead of after performance, the post-performance 
expectation instructions were eliminated 
and the “beginning of task” instructions were 
changed to the following: 


You’re having a lucky break (bad break). 
This is one of the tasks that has been going 
well (giving trouble). It’s rather easy (difficult). 
(Pause) I’m going to give you ten of 
these cards. I’m going to stop you after a 
while so I’ll need a stop watch. (E with cards 
in hand stepped to shelves in back of S. A 
stop watch was picked up and a set of 10 cards 
of the same type but of prearranged difficulty 
were slipped on the top of the pack still in the 
E’s hands. E returned to the table and 
counted off the ten cards from the top of the 
pack. E casually turned the cards over and 
looked at the symbols printed on their backs.) 
My word! (Pause) You are having a lucky 
break (bad break). (E turned the cards face 
down and spread them out.) These symbols 
indicate how difficult the problem is. This 
symbol indicates that the problem is one of 
the easier ones, this symbol indicates that the 
problem is one of the more difficult ones. You 
can see that most of yours are easy (difficult) 
ones, many more than is usual. Well, let’s get 
started. Each card has a different number. 
As you do each one always write down the 
number of the card and put your answer beside 
it. All right? Start. 


The minor changes in the wording of the instructions 
for inducing pre-performance expectations 
seemed necessary to insure expectations 
which would match the intensity of the post-performance 
expectations of the previous experiment. 


REFERENCES 

1. Gebhard, Mildred E. The effect of success 
and failure upon the attractiveness of 
activities as a function of experience, expectation 
and need. J. exp. Psychol., 
1948, 38, 371-388. 
2. Lewin, K., et al. Level of aspiration. In: 
J. McV. Hunt (Ed.), Personality and the 
behavior disorders. N. Y.: Ronald Press, 
1944. 
3. Rice, P. B. The ego and the law of effect. 
Psychol. Rev., 1946, 53, 307-320. 


(Manuscript received for immediate 
publication February 28, 1949) 





